Core v1.0.0


Neural Swipe Prediction System Specification

Feature: ONNX Transformer-Based Swipe-to-Text Prediction

Status: 🟢 IMPLEMENTED (P0 bugs resolved, P1-P2 remaining)

Priority: P0 (Core functionality)

Assignee: N/A

Date Created: 2025-10-20

Last Updated: 2025-12-04


TODOs

✅ RESOLVED - Critical Systems (P0)

All critical systems are fully implemented and verified as of 2025-12-04:

| Bug # | Issue | Resolution | Status |
|-------|-------|------------|--------|
| #257 | LanguageDetector missing | Implemented in data/LanguageDetector.kt (313 lines) | ✅ FIXED |
| #259 | NgramModel missing | Implemented in NgramModel.kt (350 lines) | ✅ FIXED |
| #262 | WordPredictor missing | Implemented in WordPredictor.kt (782 lines) | ✅ FIXED |
| #263 | UserAdaptationManager missing | Implemented in data/UserAdaptationManager.kt (291 lines) | ✅ FIXED |
| #273 | Training data lost on close | SQLite database implementation | ✅ FIXED |
| #274 | ML training system | External pipeline by design (ADR-003) | ✅ ARCHITECTURAL |
| #275 | Async prediction | Kotlin coroutines (ADR-004) | ✅ ARCHITECTURAL |
| #276 | Advanced gesture analysis | Neural network auto-learns features (ADR-005) | ✅ ARCHITECTURAL |

Initialization Order Bug (2025-11-14): Fixed race condition in CleverKeysService.kt where WordPredictor was initialized before its dependencies.

⚠️ Outstanding Issues (P1-P2)

| Bug # | Issue | File | Impact | Est. Time |
|-------|-------|------|--------|-----------|
| #270 | Time delta calculation | SwipeMLData.kt | Training timestamps may be wrong | 1 hour |
| #271 | Consecutive duplicate filtering | SwipeMLData.kt | Noisy training data | 1 hour |
| #277 | Multi-language expansion | OptimizedVocabularyImpl.kt | Only English fully tested | 8-12 hours/language |


1. Feature Overview

Purpose

Pure ONNX neural transformer architecture for converting swipe gestures into ranked word predictions. This is a complete architectural replacement of the original CGR (Continuous Gesture Recognition) system.

Key Advantages Over Legacy CGR

Architecture Comparison

Original (Java - CGR System):

```
Swipe → Manual Feature Engineering (40+ features) →
Template Matching → Dictionary Lookup →
Statistical Scoring → Predictions
```

Modern (Kotlin - ONNX System):

```
Swipe → Feature Extraction (6 features) →
Transformer Encoder → Beam Search Decoder →
Vocabulary Filter → Predictions
```

Current Status (Updated: 2025-11-14)


2. Requirements

Functional Requirements

FR-1: Swipe Input Processing

FR-2: ONNX Encoder Inference

FR-3: Beam Search Decoder

FR-4: Post-Processing

FR-5: Training Data Collection (COMPLETE)

FR-6: Multi-Language Support (FRAMEWORK READY)

FR-7: User Adaptation (COMPLETE)

Non-Functional Requirements

NFR-1: Performance

NFR-2: Accuracy

NFR-3: Resource Usage


3. Technical Design

System Architecture

```
┌───────────────────────────────────────────────────────────┐
│                     USER INTERACTION                      │
└───────────────────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────┐
│ SwipeDetector.kt (Touch Events)                           │
│ - ACTION_DOWN: Start gesture                              │
│ - ACTION_MOVE: Collect points                             │
│ - ACTION_UP: Trigger prediction                           │
└───────────────────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────┐
│ SwipeInput.kt (Data Encapsulation)                        │
│ data class SwipeInput(                                    │
│     coordinates: List<PointF>,                            │
│     timestamps: List<Long>,                               │
│     touchedKeys: List<Key>                                │
│ )                                                         │
└───────────────────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────┐
│ OnnxSwipePredictorImpl.kt (Core Pipeline)                 │
│ ┌─────────────────────────────────────────────────┐       │
│ │ SwipeTrajectoryProcessor (Feature Extraction)   │       │
│ │ - smoothTrajectory()                            │       │
│ │ - calculateVelocities()                         │       │
│ │ - calculateAccelerations()                      │       │
│ │ - normalizeCoordinates()                        │       │
│ │ - detectNearestKeys()                           │       │
│ │ - padOrTruncate(150 points)                     │       │
│ └─────────────────────────────────────────────────┘       │
│                          │                                │
│                          ▼                                │
│ ┌─────────────────────────────────────────────────┐       │
│ │ runEncoder() (Transformer Inference)            │       │
│ │ - Create input tensors [1,150,6], [1,150]       │       │
│ │ - Run ONNX encoder model                        │       │
│ │ - Output: memory [1,150,256]                    │       │
│ └─────────────────────────────────────────────────┘       │
│                          │                                │
│                          ▼                                │
│ ┌─────────────────────────────────────────────────┐       │
│ │ runBeamSearch() (Character Decoding)            │       │
│ │ - Initialize beams with SOS token               │       │
│ │ - BATCHED decoder inference (beam_width=8)      │       │
│ │ - Expand hypotheses, track scores               │       │
│ │ - Terminate on EOS or max_length                │       │
│ │ - Return top N candidates                       │       │
│ └─────────────────────────────────────────────────┘       │
│                          │                                │
│                          ▼                                │
│ ┌─────────────────────────────────────────────────┐       │
│ │ SwipeTokenizer (Token ↔ Character)              │       │
│ │ - decode([2,11,8,15,15,18,3]) → "hello"         │       │
│ │ - Special tokens: SOS=2, EOS=3, PAD=0, UNK=1    │       │
│ └─────────────────────────────────────────────────┘       │
└───────────────────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────┐
│ OptimizedVocabularyImpl.kt (Dictionary Filter)            │
│ - Check words against dictionary (English only)           │
│ - Filter OOV (out-of-vocabulary) predictions              │
│ - Bug #277: Multi-language support missing                │
└───────────────────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────┐
│ PredictionResult.kt (Output Format)                       │
│ data class PredictionResult(                              │
│     words: List<String>,      // ["hello", "hallo"]       │
│     scores: List<Int>,        // [950, 850]               │
│     confidences: List<Float>  // [0.95, 0.85]             │
│ )                                                         │
└───────────────────────────────────────────────────────────┘
                              │
                              ▼
┌───────────────────────────────────────────────────────────┐
│ SuggestionBar.kt (UI Display)                             │
│ - Show top 3 predictions                                  │
│ - Highlight by confidence color                           │
│ - Handle user selection                                   │
└───────────────────────────────────────────────────────────┘
```
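The example values in PredictionResult (0.95 alongside 950) suggest that the integer scores are confidences scaled by 1000. A minimal sketch of one plausible conversion from beam log-probabilities via softmax; the function name and scaling rule are illustrative assumptions, not the actual implementation:

```kotlin
import kotlin.math.exp

data class PredictionResult(
    val words: List<String>,
    val scores: List<Int>,
    val confidences: List<Float>
)

// Hypothetical conversion: softmax over beam log-probabilities gives
// confidences in [0,1]; integer scores are confidences scaled by 1000.
fun toPredictionResult(words: List<String>, logProbs: List<Double>): PredictionResult {
    require(words.size == logProbs.size)
    val maxLp = logProbs.maxOrNull() ?: 0.0
    val exps = logProbs.map { exp(it - maxLp) } // shift for numeric stability
    val sum = exps.sum()
    val conf = exps.map { (it / sum).toFloat() }
    return PredictionResult(words, conf.map { (it * 1000).toInt() }, conf)
}
```

Two equal log-probabilities split confidence evenly (0.5 each, score 500), which matches the ranked-output shape the UI expects.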

Critical Data Flows

1. Feature Extraction Pipeline:

```kotlin
// Input: SwipeInput
val rawCoords = swipeInput.coordinates  // [(100,250), (102,251), ...]
val timestamps = swipeInput.timestamps  // [1728567890123, 1728567890140, ...]

// Step 1: Smoothing (moving average, window=3)
val smoothed = smoothTrajectory(rawCoords)  // Reduce noise

// Step 2: Velocity (first derivative)
// Formula: velocity = distance / time_delta (pixels/sec)
val velocities = calculateVelocities(smoothed, timestamps)

// Step 3: Acceleration (second derivative)
// Formula: accel = velocity_delta / time_delta (pixels/sec²)
val accelerations = calculateAccelerations(velocities, timestamps)

// Step 4: Normalization to [0,1]
// normalized_x = x / keyboardWidth, normalized_y = y / keyboardHeight
val normalized = normalizeCoordinates(smoothed)

// Step 5: Nearest key detection
// Returns character indices: a=4, b=5, ..., z=29
val nearestKeys = detectNearestKeys(normalized)

// Step 6: Padding to 150 points
val (features, keys, mask) = padOrTruncate(
    normalized, velocities, accelerations, nearestKeys, 150
)

// Output tensors:
// trajectory_features: [1, 150, 6]  (x, y, vx, vy, ax, ay)
// nearest_keys:        [1, 150]     (character indices)
// src_mask:            [1, 150]     (attention mask - 1=real, 0=padding)
```
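The smoothing and velocity steps above can be sketched as runnable Kotlin. This is a minimal illustration assuming the window-3 moving average and pixels/sec units described in the pipeline; `Pt` and the function bodies are stand-ins, not the actual SwipeTrajectoryProcessor code:

```kotlin
data class Pt(val x: Float, val y: Float)

// Moving-average smoothing with a centered window (default window=3).
// Edge points average over whatever neighbors exist.
fun smoothTrajectory(points: List<Pt>, window: Int = 3): List<Pt> {
    if (points.size < window) return points
    val half = window / 2
    return points.indices.map { i ->
        val from = maxOf(0, i - half)
        val to = minOf(points.lastIndex, i + half)
        val n = (to - from + 1).toFloat()
        Pt(
            (from..to).sumOf { points[it].x.toDouble() }.toFloat() / n,
            (from..to).sumOf { points[it].y.toDouble() }.toFloat() / n
        )
    }
}

// First derivative: per-axis displacement over the millisecond delta,
// converted to pixels/sec. Zero deltas are clamped to avoid division by zero.
fun calculateVelocities(points: List<Pt>, timestamps: List<Long>): List<Pt> =
    (1 until points.size).map { i ->
        val dtSec = (timestamps[i] - timestamps[i - 1]).coerceAtLeast(1L) / 1000f
        Pt(
            (points[i].x - points[i - 1].x) / dtSec,
            (points[i].y - points[i - 1].y) / dtSec
        )
    }
```

Acceleration follows the same pattern applied to the velocity list instead of the raw points.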

2. Beam Search Algorithm:

```kotlin
// Initialize beams with the SOS token
val finishedBeams = mutableListOf<Beam>()
var beams = listOf(Beam(tokens = listOf(SOS), score = 0.0))

for (step in 0 until maxLength) {
    // BATCHED inference: all beams in a single model call
    val inputIds = beams.map { it.tokens }.toBatchTensor() // [batch, seq_len]

    // Run decoder model
    val logits = runDecoder(memory, inputIds) // [batch, vocab_size]

    // Expand each beam
    val newBeams = mutableListOf<Beam>()
    for ((beamIdx, beam) in beams.withIndex()) {
        val topK = logits[beamIdx].topK(beamWidth) // Top K (token, logProb) pairs
        for ((tokenIdx, logProb) in topK) {
            if (tokenIdx == EOS) {
                finishedBeams.add(beam.copy(score = beam.score + logProb))
            } else {
                newBeams.add(Beam(
                    tokens = beam.tokens + tokenIdx,
                    score = beam.score + logProb
                ))
            }
        }
    }

    // Keep the top beamWidth beams by score
    beams = newBeams.sortedByDescending { it.score }.take(beamWidth)

    // Early termination once every beam has finished
    if (beams.isEmpty()) break
}

// Return the best finished beams
return finishedBeams.sortedByDescending { it.score }
```
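To make the loop concrete, here is a toy, self-contained version in which a fixed log-probability function stands in for the ONNX decoder, so the expand/prune/finish logic can be exercised without a model. `beamSearch` and its callback are illustrative only:

```kotlin
data class Beam(val tokens: List<Int>, val score: Double)

const val SOS = 2
const val EOS = 3

// Toy beam search: `logits` maps a token prefix to per-token log-probs.
// In the real pipeline this call is the batched ONNX decoder inference.
fun beamSearch(
    beamWidth: Int,
    maxLength: Int,
    logits: (List<Int>) -> DoubleArray
): List<Beam> {
    var beams = listOf(Beam(listOf(SOS), 0.0))
    val finished = mutableListOf<Beam>()
    for (step in 0 until maxLength) {
        val next = mutableListOf<Beam>()
        for (beam in beams) {
            val probs = logits(beam.tokens)
            // Take the top beamWidth continuations of this beam
            probs.withIndex().sortedByDescending { it.value }
                .take(beamWidth)
                .forEach { (tok, lp) ->
                    if (tok == EOS) finished += Beam(beam.tokens + tok, beam.score + lp)
                    else next += Beam(beam.tokens + tok, beam.score + lp)
                }
        }
        beams = next.sortedByDescending { it.score }.take(beamWidth)
        if (beams.isEmpty()) break
    }
    finished += beams // unfinished hypotheses still count at maxLength
    return finished.sortedByDescending { it.score }
}
```

With a distribution that always prefers one letter slightly over EOS, the best hypothesis is the one that stops earliest, since each extra step adds more negative log-probability.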

3. Token-to-Character Decoding:

```kotlin
// Token indices → characters
val CHAR_MAP = mapOf(
    4 to 'a', 5 to 'b', 6 to 'c', ..., 29 to 'z',
    30 to ' ', 31 to '\'', 32 to '-'
)

fun decode(tokens: List<Int>): String {
    return tokens
        .filter { it !in listOf(SOS, EOS, PAD, UNK) }
        .mapNotNull { CHAR_MAP[it] }
        .joinToString("")
}

// Example:
// tokens:  [2, 11, 8, 15, 15, 18, 3]  (SOS, h, e, l, l, o, EOS)
// decoded: "hello"
```
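A runnable sketch of the tokenizer, building the a=4 through z=29 layout programmatically instead of writing the map out; `CharTokenizer` and its `encode` helper are illustrative, and the real SwipeTokenizer may differ in detail:

```kotlin
object CharTokenizer {
    const val PAD = 0
    const val UNK = 1
    const val SOS = 2
    const val EOS = 3

    // a=4 ... z=29, plus space, apostrophe, hyphen
    private val idToChar: Map<Int, Char> =
        ('a'..'z').mapIndexed { i, c -> (i + 4) to c }.toMap() +
            mapOf(30 to ' ', 31 to '\'', 32 to '-')
    private val charToId: Map<Char, Int> =
        idToChar.entries.associate { (id, ch) -> ch to id }

    // Wrap a word in SOS/EOS; unknown characters map to UNK
    fun encode(word: String): List<Int> =
        listOf(SOS) + word.map { charToId[it] ?: UNK } + listOf(EOS)

    // Strip special tokens, then map indices back to characters
    fun decode(tokens: List<Int>): String = tokens
        .filter { it !in listOf(SOS, EOS, PAD, UNK) }
        .mapNotNull { idToChar[it] }
        .joinToString("")
}
```

Round-tripping "hello" through this layout yields the token sequence [2, 11, 8, 15, 15, 18, 3] used in the example above.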

Model Architecture

Encoder Model (swipe_model_character_quant.onnx):

- Inputs:
  - trajectory_features [1, 150, 6]: (x, y, velocity_x, velocity_y, accel_x, accel_y)
  - nearest_keys [1, 150]: character indices detected under each point
  - src_mask [1, 150]: attention mask (1=real point, 0=padding)
- Output: memory [1, 150, 256], the encoded representation of the swipe trajectory

Decoder Model (swipe_decoder_character_quant.onnx):

- Inputs: encoder memory plus the token sequence decoded so far
- Output: next-character probabilities over the vocabulary


4. Implementation Plan

Phase 1: Critical Bug Fixes (2-3 days)

Priority: P0 - Fix data loss and training bugs

- Training data persistence (Bug #273):
  - Create SQLite database schema
  - Migrate SwipeMLDataStore to use Room/SQLite
  - Implement batch insert for performance
  - Add data export/import functionality
  - Time: 4-6 hours
- Time delta calculation (Bug #270):
  - Fix addRawPoint() timestamp logic
  - Use proper millisecond differences
  - Time: 1 hour
- Consecutive duplicate filtering (Bug #271):
  - Add logic to skip duplicate keys in sequence
  - Preserve only direction changes
  - Time: 1 hour
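Bugs #270 and #271 are small enough to illustrate directly; `RawPoint` and the function names below are hypothetical stand-ins for the real SwipeMLData.kt fields:

```kotlin
data class RawPoint(val key: Char, val tMillis: Long)

// Bug #271 sketch: drop consecutive duplicate keys so only transitions
// (direction changes between keys) remain in the training sequence.
fun dropConsecutiveDuplicates(points: List<RawPoint>): List<RawPoint> =
    points.filterIndexed { i, p -> i == 0 || p.key != points[i - 1].key }

// Bug #270 sketch: time deltas as true millisecond differences between
// adjacent points, rather than any derived or absolute value.
fun timeDeltas(points: List<RawPoint>): List<Long> =
    (1 until points.size).map { points[it].tMillis - points[it - 1].tMillis }
```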

Phase 2: Multi-Language Support (1-2 weeks)

Priority: P1 - Enable multiple languages

- Language detection (Bug #257: data/LanguageDetector.kt, 313 lines, now implemented)

- Per-language ONNX models

- Per-language vocabularies

- User dictionary support

- Language switcher UI

- Time: 8-12 hours implementation + model training
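As a sketch of the detection idea only (not the actual LanguageDetector.kt, which the resolution table says is already implemented), dictionary-hit scoring picks the language whose vocabulary covers the most observed words; the word lists here are tiny hypothetical samples:

```kotlin
// Hypothetical per-language sample vocabularies
val sampleVocab: Map<String, Set<String>> = mapOf(
    "en" to setOf("the", "and", "hello", "you"),
    "de" to setOf("und", "der", "hallo", "du")
)

// Return the language whose vocabulary matches the most words,
// or null when no vocabularies are configured
fun detectLanguage(words: List<String>): String? =
    sampleVocab.entries.maxByOrNull { (_, vocab) ->
        words.count { it.lowercase() in vocab }
    }?.key
```

A production detector would weight by word frequency and fall back to character n-gram statistics for short inputs, but the scoring skeleton is the same.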

Phase 3: User Adaptation (2-3 weeks)

Priority: P1 - Personalization

- UserAdaptationManager (291 lines, now implemented in data/UserAdaptationManager.kt)

- Frequency tracking

- User-specific corrections

- Personalized scoring adjustments

- Time: 12-16 hours
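A minimal sketch of the frequency-tracking idea behind this phase; the class name and the capped boost formula are illustrative assumptions, not the UserAdaptationManager API:

```kotlin
class UserFrequencyBooster {
    // Selection counts per word, the raw signal for personalization
    private val counts = mutableMapOf<String, Int>()

    fun recordSelection(word: String) {
        counts[word] = (counts[word] ?: 0) + 1
    }

    // Blend the model score with a small usage-based boost, capped so a
    // heavily used word cannot completely override the neural ranking
    fun adjustedScore(word: String, modelScore: Float): Float {
        val n = counts[word] ?: 0
        return modelScore + 0.05f * minOf(n, 10)
    }
}
```

Capping the boost keeps adaptation a tie-breaker between plausible candidates rather than a source of wrong-word lock-in.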

Phase 4: Training Infrastructure (4-6 weeks - External)

Priority: P0 - Enable model improvements

- Python/PyTorch training pipeline

- Data preprocessing scripts

- Model architecture definition

- Training loop with validation

- ONNX export scripts

- Time: 2-3 weeks (full infrastructure)

- Note: This is INTENTIONAL external training (ADR-003)


5. Testing Strategy

Unit Tests

Feature Extraction Tests:

```kotlin
@Test
fun `smoothTrajectory reduces noise`() {
    val noisy = listOf(
        PointF(100f, 100f),
        PointF(105f, 102f), // Noise spike
        PointF(102f, 101f)
    )
    val smoothed = smoothTrajectory(noisy)
    assertTrue(smoothed[1].x < noisy[1].x) // Spike reduced
}

@Test
fun `velocity calculation is correct`() {
    val coords = listOf(PointF(0f, 0f), PointF(100f, 0f))
    val timestamps = listOf(0L, 1000L) // 1 second apart
    val velocities = calculateVelocities(coords, timestamps)
    assertEquals(100f, velocities[0], 0.1f) // 100 pixels/sec
}

@Test
fun `padding extends to 150 points`() {
    val coords = List(50) { PointF(it.toFloat(), 0f) }
    val (features, _, mask) = padOrTruncate(coords, ..., 150)
    assertEquals(150, features.size)
    assertEquals(50, mask.count { it == 1 }) // 50 real, 100 padded
}
```

Beam Search Tests:

```kotlin
@Test
fun `beam search returns top N candidates`() {
    val memory = createMockMemory()
    val beams = runBeamSearch(memory, beamWidth = 8, maxLength = 10)
    assertTrue(beams.size <= 8)
    assertTrue(beams[0].score >= beams[1].score) // Sorted by score
}

@Test
fun `beam search terminates on EOS`() {
    val memory = createMockMemory()
    val beams = runBeamSearch(memory, beamWidth = 1, maxLength = 20)
    assertTrue(beams[0].tokens.last() == EOS || beams[0].tokens.size == 20)
}
```

Tokenization Tests:

```kotlin
@Test
fun `decode converts tokens to string`() {
    val tokens = listOf(SOS, 11, 8, 15, 15, 18, EOS) // "hello"
    val decoded = SwipeTokenizer.decode(tokens)
    assertEquals("hello", decoded)
}

@Test
fun `decode filters special tokens`() {
    val tokens = listOf(SOS, PAD, 11, UNK, EOS)
    val decoded = SwipeTokenizer.decode(tokens)
    assertEquals("h", decoded) // Only 'h' remains
}
```

Integration Tests

End-to-End Pipeline:

```kotlin
@Test
fun `full pipeline produces predictions`() {
    val swipeInput = SwipeInput(
        coordinates = createMockSwipe(), // Simulate "hello" swipe
        timestamps = createMockTimestamps(),
        touchedKeys = emptyList()
    )
    val predictions = onnxPredictor.predict(swipeInput)
    assertTrue(predictions.words.isNotEmpty())
    assertTrue(predictions.words[0] in listOf("hello", "hallo", "hell"))
    assertTrue(predictions.scores[0] > predictions.scores[1])
}
```

Manual Testing Checklist


6. Success Criteria

Functional Success

Technical Success

User Experience Success


7. References

Documentation

Source Files

Models

Bug Reports

| Bug # | Description | Status |
|-------|-------------|--------|
| #257 | LanguageDetector missing | ✅ FIXED |
| #259 | NgramModel missing | ✅ FIXED |
| #262 | WordPredictor integration | ✅ ARCHITECTURAL (works alongside ONNX) |
| #263 | UserAdaptationManager missing | ✅ FIXED |
| #270 | Time delta calculation | ⚠️ Needs verification |
| #271 | Consecutive duplicates filter | ⚠️ Needs verification |
| #273 | Training data persistence | ✅ FIXED (SQLite) |
| #274 | ML training external | ✅ ARCHITECTURAL (by design) |
| #275 | Async prediction | ✅ ARCHITECTURAL (coroutines) |
| #276 | Trace analyzer | ✅ ARCHITECTURAL (neural features) |
| #277 | Multi-language expansion | ⚠️ Framework ready, needs assets |


8. Notes

Why ONNX Over CGR

Implementation Complexity

Future Enhancements


Last Updated: 2025-12-04

Status: ✅ Core complete, all P0 bugs resolved

Priority: P1-P2 remaining (time delta calculation, duplicate filtering, multi-language expansion)