Updates

Changelog

New features, improvements, and fixes for every release of LOTA.

1.2.8

May 27, 2026

Changed

Middle-page Point Cloud mode shows the live RGB camera — no on-screen point overlay. The LiDAR compute kernel still runs and PLY streaming is unchanged; geometry is meant to be visualized receiver-side. The 60-slot ring buffer (~70 MB GPU memory) is gone
Gaussian Capture Rate picker is now 30 / 15 / ~7 Hz (was 60 / 30 / 15). iPhone LiDAR runs at 30 Hz; the old "60 Hz" option processed duplicate ARFrames for no extra coverage. GaussianRecorder.recordFrame filters duplicate-buffer ARFrames before striding, so the picker now reflects unique LiDAR scans/sec. Stored 60 values migrate to 30 on load
Backpressure on the PLY TCP transport — TCPTransport.sendRaw (PLY only) gates on a ~1.5 MB inflight-bytes counter and drops new frames when the receiver lags rather than queueing them. Phone-side memory and CPU stay flat under a stall; the receiver gets fresh frames when it recovers instead of a backlog. Also enables noDelay on the PLY socket. Video TCP transport unaffected

Removed

Frame Window slider in Point Cloud Settings — drove the now-deleted ring buffer; nothing downstream consumed it after the camera-only middle page change. Setting + UserDefaults key + UI row + renderer wiring all deleted
Dead GPU point-accumulation API in `MetalRenderer` — startAccumulation / stopAccumulation / appendAccumulationPoints and the matching GaussianRecorder.onPointsExtracted callback had no callers anywhere; ~100 lines of unreachable code removed. CPU-side accumulation in GaussianRecorder.allPositions (the actual driver of Gaussian / COLMAP / Nerfstudio / PLY exports) is untouched

Fixed

Compute Quality's Balanced and Efficient options were silently dropping all or most PLY frames — drawPointCloud incremented its frame counter on every ARFrame but gated compute on depthChanged && (counter % skip == 0). At skip=2 (Balanced, the default), the % 2 == 0 phase landed on duplicate-buffer ARFrames where depthChanged was false, so the two filters never both passed and compute never ran. Full worked by luck (% 1 is always true). Fixed by moving the counter increment inside the if depthChanged branch so the stride counts unique LiDAR scans. Users on Balanced (the default) were effectively getting no PLY stream
Receiver IP field accepts commas as periods — locales that use comma as the decimal separator (de_DE, fr_FR, es_ES, most of continental Europe) show a comma key on iOS's .decimalPad where US users see a period. The field now intercepts the comma via onChange and replaces it before it lands in receiverHost. Invisible to US users; immediate fix for users on comma-locale keyboards who were silently getting bogus IPs. Also handles paste-from-clipboard with comma-separated addresses

Now on the App Store

LOTA 1.2.7 is the first release available on the App Store.

1.2.7

May 16, 2026

Added

Gaussian splat viewing in the PLY Viewer — the PLY Viewer now recognizes 3D Gaussian Splat .ply files (the layout OpenSplat and other 3DGS trainers write) and renders them as splats instead of flat points. A splat is detected from the header by its splat-specific properties (scale_0, rot_0, f_dc_0, opacity); a file without them still loads on the existing point-cloud path. Capture/PLYReader.swift parses the splat properties off the main thread, applies the standard activations (exponentiate the log-scale, sigmoid the opacity, SH degree-0 basis for color, normalize the rotation quaternion), precomputes each Gaussian's 3D covariance, and returns a new LoadedScene enum so callers route point clouds and splats separately
Metal Gaussian splat renderer — new Rendering/SplatRenderer.swift + Rendering/SplatShaders.metal. EWA splatting: each Gaussian's precomputed 3D covariance is projected to a 2D screen-space ellipse through the perspective Jacobian and drawn as an instanced, alpha-blended quad. A GPU bitonic sort orders the Gaussians back-to-front every frame the camera moves (a static view skips the sort, so an idle viewer costs nothing). v1 renders the degree-0 (DC) color; capped at 2,000,000 Gaussians with uniform sub-sampling
Splat Size control — when the loaded file is a splat, the PLY Viewer Settings sheet replaces the Color section and Point Size slider with a single Splat Size slider (0.25×–3.0×, default 1.0×) that scales every Gaussian. Persisted in Models/PLYViewerSettings.swift
ASCII PLY export — the Point Cloud export format gains a PLY Format toggle (binary / ASCII) in the Capture Settings sheet, shown only when Point Cloud is selected. ASCII writes a human-readable text PLY — one x y z r g b line per point, positions at 6-decimal precision — easier to inspect in an editor or parse in a custom script, roughly 3× larger before zipping. Capture/COLMAPExporter.swift gains an ascii parameter and a sibling text writer. ASCII artifacts are named LOTA_<timestamp>_PLY_ASCII.zip so the encoding is clear from the filename; the encoding is snapshotted at startRecording so toggling mid-session cannot desync the zip suffix from the file

Changed

PLY Viewer adapts to the loaded file kind — the info chip shows a splat count for splats (point count for clouds), the settings sheet branches between the point-cloud controls and the splat control, and the neon axis-orientation gizmo now appears for splats as well as point clouds
`MetalRenderer` gained a splat draw branch — draw(in:) routes to the SplatRenderer when a splat is loaded, parallel to the existing staticCloudActive static-cloud path, so the point-cloud render path is unchanged. Rendering/ViewerCamera.swift now exposes its view and projection matrices separately (the splat EWA Jacobian needs camera space, not just the combined view-projection)

1.2.6

May 13, 2026

Added

PLY Viewer export mode (right page) — new read-only entry in the Gaussian Capture format dropdown for inspecting .ply point clouds without leaving LOTA. Selecting it pauses the ARSession (saves battery and thermals) and arms the renderer's static-cloud branch. The bottom shutter morphs into a folder glyph that opens .fileImporter filtered to a new dev.lota.ply UTI. Capture/PLYReader.swift streams the binary LE payload off the main thread, builds two SIMD3<Float> buffers, computes the AABB, and caps at 5 M points with uniform sub-sampling
Orbit camera with pan, zoom, and snap — Rendering/ViewerCamera.swift (spherical yaw/pitch/distance/target/pan), fit-bounds reset at 15° pitch. Gestures go through a UIKit overlay (Views/PLYViewerGestureLayer.swift) so single-finger drag (orbit), two-finger drag (pan), and pinch (zoom) can be disambiguated by touch count, which SwiftUI's DragGesture cannot do. Double-tap resets to fit-bounds. The renderer reads the view-projection matrix once per frame via a nonisolated(unsafe) snapshot pushed from a TimelineView tick
Static point cloud shader path — staticPointCloudVertex in Shaders.metal plus a sibling pipeline that reuses pointCloudFragment. draw(in:) short-circuits to drawStaticCloud when staticCloudActive is true, skipping ARKit passes and clearing to near-black. The shader applies a model matrix (up-axis swap + mirror flips) before view-projection and branches color by a uniform mode index, sampling a 256-texel 1D LUT for scalar palettes
Five viewer color modes — Original (per-point RGB), Solid (ColorPicker), By Height (Y coordinate), By Distance (length from camera origin, live as you orbit), By Axis (signed distance along chosen X/Y/Z). Scalar modes share the nine DepthColorMapStyle palettes. Original on a colorless file falls back to white
Neon axis orientation gizmo (top-trailing overlay in viewer mode) — six axis caps projected from the live camera basis, X/Y/Z letters on positive ends and hollow rings on negative ends (Blender convention). Tap any cap to spring-snap the camera via a shared ViewerCamera.snap(toAxis:sign:) that the Top/Front/Side preset picker also routes through. Mirror toggles do not affect the gizmo so the user keeps a stable world-axis reference after flipping handedness
Cloud info chip — centered top chip while loaded: point count (thousand-separated), AABB extents in metres or centimetres, file size on disk, plus subsampled (yellow) and no RGB badges. Same RoundedRectangle(8) + .ultraThinMaterial treatment as the live capture counter row
PLY Viewer Settings sheet — Views/PLYViewerSettingsSheet.swift: Color (mode + palette + axis), Orientation (camera mode + optional spin speed + Y-up/Z-up + per-axis mirrors + Reset View), Point Size (1.0–12.0 slider). Persisted via Models/PLYViewerSettings.swift (@Published + UserDefaults)
Capture Rate setting for the recording formats (COLMAP / Nerfstudio / Nerfstudio + Depth / Point Cloud PLY) — 60 / 30 / 15 Hz segmented picker that throttles GaussianRecorder.recordFrame. Drops every Nth ARFrame before extractPoints, so the per-pixel CPU work and JPEG encode are skipped too. Roughly halves point count and zip size at 30 Hz, quarters at 15 Hz. Default 60 preserves pre-1.2.6 behavior. Lives in a new Views/GaussianCaptureSettingsSheet.swift reached via a Capture settings button that replaces the previous "No settings for this mode" placeholder
`dev.lota.ply` UTType declaration in Info.plist (UTImportedTypeDeclarations) — imported type conforming to public.data with the ply extension. Required for .fileImporter to filter cleanly across Files, iCloud Drive, and third-party providers; the system has no built-in UTI for PLY

Changed

3D-scene recording formats now have a settings cog — was a dimmed "No settings for this mode" pill. Same ModeSettingsButton shell as Material / IMU / Audio, opens the new Capture Settings sheet
`ExportFormat` extended with `.plyViewer` + `isViewer: Bool` — sits alongside the existing isSingleShot and isTimeSeries flags. needsImages and needsPoses now also gate on !isViewer so accidentally routing a viewer format through the keyframe / JPEG paths is a compile error rather than a runtime no-op

1.2.5

May 7, 2026

Fixed

iPad rotation no longer reverts the active capture mode to Color — on iPad, picking a non-camera capture mode (Motion, Audio, Transcription) on the middle page, rotating to landscape, then rotating back to portrait would silently snap the mode back to Color. iPhone was unaffected. Root cause: PageTabViewStyle's underlying UIPageViewController is more aggressive about discarding cached off-screen pages on iPad than on iPhone, so the right page (GaussianCaptureView) was being remounted on the landscape→portrait transition. Its .onAppear re-fired and called applyCaptureModeForFormat(.colmap) — the default export format — which falls into the default: branch of the helper and runs setMode(.color), clobbering whatever mode the user had selected on the middle page. The author's existing comment ("onAppear doesn't re-fire in TabView's .page style") documented an assumption that holds on iPhone but breaks on iPad. Fixed by gating the offending .onAppear with guard selectedPage == 1 else { return } so the format-restore logic only runs when the right page is actually the active one. The legitimate first-entry-to-right-page path remains covered by .onChange(of: selectedPage) directly below it. Verified end-to-end with instrumented logging on a physical iPad before applying the fix

1.2.4

May 5, 2026

Added

NDI SDK license artifacts — Vendor/NDI/NDI SDK License Agreement.pdf (v1.1, 15 August 2025) and Vendor/NDI/README.md documenting SDK version (v6 Apple), download steps, Xcode wiring, and license obligations. Pins the terms LOTA ships under and gives a fresh-clone path to set up the now-untracked static lib
Acknowledgements section in Transmission Settings — new section below Help → Replay Tutorial showing NDI® is a registered trademark of Vizrt NDI AB. and a link to ndi.video. Satisfies the NDI SDK license attribution requirement for shipping apps
Dependencies section in `README.md` — table of external components (NDI SDK, ZIPFoundation, Depth Anything V2) with versions, sources, distribution, and licenses. Getting Started expanded with the NDI SDK install/copy step now required on a fresh clone
`APP-STORE-DESCRIPTION.md` — promotional copy ready to drop into App Store Connect

Changed

`Vendor/NDI/lib/libndi_ios.a` gitignored — the 256 MB static lib is no longer tracked. Per NDI license: don't redistribute the SDK; per repo hygiene: don't commit large binaries. Local Archive builds work fine since the file stays on the developer's disk; Xcode Cloud builds will need a ci_post_clone.sh follow-up to download and stage the SDK before building

1.2.3

April 30, 2026

Added

IMU Trace export mode (right page) — records device motion sensors (acceleration, gyroscope, compass heading, barometric pressure / relative altitude) over time. The right page renders the same scrolling lane visualization as the middle page's Motion mode while the trace is being collected. Stop produces a ZIP containing a CSV/TSV file (user-selectable format), manifest.json describing every column (name, unit, dtype, description), and README.txt — saved to the configured iCloud export folder
Audio Trace export mode (right page) — records audio analysis frames (bass/mid/high levels, drum onsets, dynamics, 20-band FFT spectrum) over time using the same mic engine as the middle page's Audio mode. Same ZIP layout as IMU Trace
PDF report companion for both trace modes — multi-page dark-theme report with cover (device + recording metadata), per-channel time-series line charts, derived |A| accelerometer magnitude, FFT spectrogram heatmap (rendered directly as a CGImage for pixel-perfect output), beat-events timeline, compass-heading polar rose (vector), and a per-channel statistics table. Toggle "Include PDF Report" in the trace mode's settings sheet to bundle alongside the CSV/TSV
Auto-stop duration cap for traces — toggle in the trace mode's settings sheet plus a 5–600 s slider. Recording auto-stops, exports, and surfaces the save-summary sheet when the cap is reached. Off by default; recording continues until manually stopped
Local recording to Camera Roll (middle page) — records the live composited Metal output as an H.264 .mov and saves to the user's Photos library via PHPhotoLibrary add-only access. Available in every camera-using mode (Color, Mono, Depth, Neural Depth, Point Cloud, Blob Track); disabled in Transcription / Motion / Audio (no camera content). 4-hour hard cap
STREAM | RECORD paradigm switch (middle page) — iOS Camera-style segmented control under the shutter button. Stream and Record are mutually exclusive: the segmented control locks while either is active so a recording can't start mid-stream and vice versa. The same shutter button drives whichever paradigm is selected
Unified iOS Camera-style bottom bar across all three pages — same 70pt shutter button at the same vertical position on every page. Right page is single-paradigm (capture-to-file, no pill); left page is single-paradigm (Stream, single highlighted pill); middle page has the full STREAM | RECORD segmented. Swiping between pages keeps the primary action button locked in place
Bottom-bar pinning to the device's physical bottom edge — the shutter + pill island stays at the same physical position on the device regardless of orientation, matching iOS Camera's shutter behavior. Rotate to landscape and the island slides to the screen edge that now corresponds to the device's bottom (where it was in portrait), counter-rotating as one unit. Other UI (mode picker, status bar, EndpointSummary, demo / folder picker side controls) continues to rotate naturally with the interface
Photos (add-only) permission added to the first-launch onboarding flow alongside the existing six. Requested via PHPhotoLibrary.requestAuthorization(for: .addOnly) so iOS surfaces the narrower add-only prompt; LOTA never reads existing photos. New NSPhotoLibraryAddUsageDescription in Info.plist
Marketing device names in trace exports — sysctl hardware IDs (e.g. iPhone18,2) are now mapped to consumer names (iPhone 17 Pro Max) in trace manifest.json and the PDF report header. Lookup table in LOTA/Utilities/DeviceModel.swift covers iPhone 8 through iPhone 17; unknown IDs fall through to the raw sysctl string
No-data placeholder for trace PDF charts — when a chart has no data to plot (for example, an Audio Trace that detected zero beats), the page now shows a centered "No data captured" message + an explanation, instead of rendering empty axes. Lets the recipient distinguish "the recording captured nothing for this signal" from "the chart is broken"
Mode picker chevron — the picker trigger's secondary glyph is now a chevron that flips down/up based on whether the panel is open. Replaces the previous 3×2 dot grid

Changed

PLY point cloud streaming is binary-only — the CSV-text variant, its Settings toggle, and the PLYEncoder.encode function are removed. Every PLY frame on the wire is now UInt32 LE count + N × (3 Float32 + 3 UInt8). Receivers using the `LOTABinaryPLYReceiverV2.tox` companion component on the LOTA docs page need no changes; legacy CSV receivers will need to switch to the binary parser
Every export ZIP now ends with `_<MODE>` (LOTA_<timestamp>_COLMAP.zip, _Material.zip, _IMU.zip, etc.). Single source of truth via ExportFormat.fileSuffix; applies to existing Gaussian / Material exports as well as the new trace modes

1.2.2

April 29, 2026

Added

Neural Depth capture mode — new .neuralDepth case in CaptureMode. Runs Depth Anything V2 Small (ByteDance Research, packaged for Core ML by Apple, Apache 2.0, 24.8 M params) on the Apple Neural Engine to estimate depth from the regular camera feed. Works on every iPhone, no LiDAR required. DepthAnythingEngine (LOTA/ML/DepthAnythingEngine.swift) wraps MLModel + MLDictionaryFeatureProvider, calls back via onInferenceComplete, and feeds an AI-depth Metal texture into the existing depth render path (drawAIDepth + new renderMode == 4 branch in Shaders.metal's depthFragment). LiDAR-equipped phones can still pick Neural Depth as a comparison source. Selectable from the mode picker with a brain.head.profile icon between Depth and Point Cloud
Neural Depth loading overlay — selecting Neural Depth shows a centered glass card explaining the model is initializing and the first frame may take a moment. Driven by captureManager.neuralDepthLoading; auto-fades on first inference completion via .animation(.easeOut(duration: 0.2))
NDI side-by-side falls back to AI depth on non-LiDAR phones — the side-by-side toggle previously did nothing on phones without LiDAR (no depth source to composite). Now when enabled on a non-LiDAR device, the right half is filled with Depth Anything V2 estimated depth instead, so the multi-pane NDI workflow works on every iPhone. Toggle in TransmissionSettingsView is also disabled for modes where side-by-side has no meaning (e.g. transcription/motion/audio) with a one-line caption explaining why
Per-device NDI label — StreamingSettings.deviceLabel (@AppStorage key device.label), initialized to a random 4-character base36 ID (e.g. 7F3A) on first launch and persisted forever. Used to build the broadcast name as LOTA - <deviceLabel> (default looks like LOTA - 7F3A, customized like LOTA - Stage Left). Editable in the Receiver section of Transmission Settings with a 20-character cap, whitespace trim, and fallback to the random default if cleared. Multi-device venues can now distinguish each phone's NDI source in TouchDesigner / OBS / NDI Studio Monitor
Live NDI label restart — CaptureManager adds a debounced (0.3 s) Combine sink on streamingSettings.$deviceLabel. When the label changes mid-stream, StreamingService.restartNDI(sourceName:) tears down and recreates the NDI sender so the new name surfaces in receivers within ~1 s with no user action
Device Label first-launch pulse — the Device Label field pulses with a subtle yellow border + scale animation the first time the user opens Transmission Settings, until they tap into the field. Tracked via @AppStorage("lota.hasSeenDeviceLabel"); once true the pulse never fires again
NetworkInfo helper + observer (LOTA/Utilities/NetworkInfo.swift) — NetworkInfo.currentIPv4Addresses() wraps getifaddrs / freeifaddrs and returns active IPv4 interfaces with friendly labels (en0/en1 → "Wi-Fi", bridge100/bridge101 → "USB Hotspot", en2–en9 → "Ethernet" for USB-Ethernet adapters, others returned raw). Skips loopback, IPv6, and link-local. NetworkInfoObserver is an ObservableObject that owns an NWPathMonitor, publishes the address list on path changes, refreshes on init, and cancels the monitor on deinit
This Device section in Transmission Settings — pinned to the top of the form (above Receiver), lists the iPhone's currently active IPv4 addresses with friendly interface labels. Tap any row to copy the address (UIPasteboard.general.string) with a haptic + brief "Copied" chip. Live updates via the same NetworkInfoObserver so the list refreshes as the user toggles Personal Hotspot, plugs in / unplugs USB-Ethernet, or joins / leaves Wi-Fi without re-opening the sheet. Helper text clarifies these are the iPhone's own addresses, not values to paste into the Receiver IP field — they're meant to help the operator set the correct receiver IP on their computer
EndpointSummary redesigned as a centered table — was a left-aligned monospaced caption block. Now a Grid with three centered columns (icon · protocol · host:port/value) sized to its content and horizontally centered in the page. Same dim 55 %-white treatment, same trigger conditions (only renders when streaming and at least one protocol is enabled). New EndpointDescriptions struct centralizes the per-mode protocol sub-captions
NDI source name surfaced in EndpointSummary — the NDI row's value column now shows LOTA - <deviceLabel> instead of "Broadcasting", so the operator can confirm the broadcast name without leaving the live view. Updates live when the label changes
Device addresses footer in EndpointSummary — new compact line below the protocol rows: This iPhone: 192.168.1.42 (Wi-Fi) · 172.20.10.1 (USB Hotspot). Hidden when no addresses are available (airplane mode). Subscribes to the same NetworkInfoObserver so it stays current
Wired-vs-wireless transmit-button icon — when the configured Receiver IP is on the same subnet as the phone's USB Hotspot interface (bridge100/bridge101), the transmit button uses personalhotspot.circle / .fill instead of antenna.radiowaves.left.and.right.circle / .fill. Visual confirmation that the operator's wired direct-connect setup is correctly targeted. Switches automatically as Receiver IP or interfaces change. Helper NetworkInfo.isWiredDestination(_:using:) does the subnet match
Local Network permission requested upfront — first-launch PermissionsBootstrapper now triggers the Local Network dialog by briefly starting an NWBrowser for _lota._udp after the OSC/NDI/PLY transports are configured. Avoids the mid-shoot "LOTA would like to find devices on your local network" surprise the first time the user taps Transmit
Motion & Fitness permission requested upfront — PermissionsBootstrapper briefly starts CMMotionManager.startDeviceMotionUpdates(to:) to trigger the system Motion & Fitness dialog at first launch. Previously the dialog appeared the first time the user entered Motion mode (or, more confusingly, the first time CMAltimeter was instantiated, which could surprise users who hadn't touched motion features yet)
PermissionsIntroView now lists six permissions — Camera, Microphone, Speech Recognition, Local Network, Location, and Motion & Fitness, each with an icon + one-line purpose. Same upfront grant flow; six dialogs fire in sequence after Continue
Tutorial endpointSummary step — new step added between Status Bar and Performance Note that walks the user through the bottom endpoint summary block (NDI source name + device IPs), so the surface is discoverable. Brings the guided tour to 10 steps. New anchor in TutorialAnchor; EndpointSummary exposes a forceShow parameter so the block stays visible during this step even when streaming is off
No-settings placeholder pill — Color and Mono on the middle page, and non-Material export formats on the right page, now show a dimmed No settings for this mode pill in the same slot as the active mode-settings button. Keeps the vertical layout stable as the user cycles modes and signals where settings would live. New ModeSettingsPlaceholder view sibling of ModeSettingsButton

Changed

Mode settings button background darkened for outdoor readability — ModeSettingsButton's translucent white background (.white.opacity(0.10)) replaced with semi-opaque black (.black.opacity(0.45)) plus a hairline white border (.white.opacity(0.10)) and brighter text (.white.opacity(0.9)). Same glass-pill silhouette but readable against bright outdoor scenes
Mode picker panel width — landscape fix — ModePickerPanel's LazyVGrid previously expanded to fill its container, so in landscape the glass background slab stretched almost the full screen width even though the cells stayed clustered. Wrapped the grid in an HStack(spacing: 0) { Spacer(); LazyVGrid().fixedSize(horizontal: true, vertical: false); Spacer() } so the panel hugs its intrinsic content width (~280 pt for 3 columns) and stays centered in both orientations
All per-mode + tracking-mode settings sheets now use `.presentationDetents([.medium, .large])` with `.presentationDragIndicator(.visible)` — was full-screen sheets (default) on every page except Material Settings, which gave the per-mode sheets a flat-black look while Material had the glass-over-camera feel. Applied to DepthSettingsView, NeuralDepthSettingsView, PointCloudSettingsView, BlobTrackingSettingsView, TranscriptionModeSettingsView, MotionSettingsView, AudioSettingsView, BodyTrackingSettingsView, FaceTrackingSettingsView, HandsTrackingSettingsView, and the shared TransmissionSettingsView. Single visual family across the app: drag indicator, partial sheet at medium, swipe to expand
NDI default broadcast name — was hardcoded LOTA (iPhone) (NDITransport.swift:36). NDITransport.start(_:) now takes a sourceName: String parameter, set by StreamingService from streamingSettings.deviceLabel. Default LOTA - <4-char ID> distinguishes multiple LOTA devices on the same network without manual configuration
Em-dashes replaced with colons across user-facing onboarding and tour copy — applied to PermissionsIntroView, the privacy card, and tutorial step captions per design preference
App logo cropping — the intro screen's logo image is clipped to a RoundedRectangle(cornerRadius: 21, style: .continuous) so a square LotaLogo asset displays as a rounded-rect "squircle" matching iOS app-icon shape
Tutorial Skip button vertical alignment — was misaligned with the transmit button across the bottom bar. Now mirrored to the same baseline so Skip on the bottom-leading and Transmit on the bottom-trailing share a horizontal axis
EndpointSummary description map — descriptions for each protocol row are mode-aware via EndpointDescriptions.forCaptureMode(_:) and forTrackingMode(_:), so the same row template renders the correct sub-caption depending on what the operator is actually streaming (e.g. NDI says "Camera + LiDAR depth side-by-side" in depth mode but "Camera + AI depth side-by-side" on non-LiDAR phones in Neural Depth mode)
`StreamingService.start(settings:)` builds the NDI source name once at start time as "LOTA - \(settings.deviceLabel)" and threads it into ndiTransport.start(sourceName:). New restartNDI(sourceName:) helper does the stop+start dance for live label changes without restarting the entire streaming pipeline
`FrameEncoder` `.neuralDepth` case — skips raw depth wire encoding (the AI-estimated depth is rendered locally and surfaced via NDI side-by-side; no per-pixel float depth is transmitted over TCP/UDP for this mode)

Fixed

Side-by-Side toggle wired into capability check — was always tappable but silently no-op on modes where it has no meaning. Now disabled with a one-line caption describing why per current mode (no depth source, AI-fallback in use, etc.)

1.2.1

April 29, 2026

Added

First-launch onboarding — PermissionsIntroView shows on first cold launch with an app logo, brief copy, four per-permission rows (camera, microphone, speech recognition, location), a green privacy reassurance card explaining LOTA doesn't track or store data, and a Continue button that fires the four OS permission dialogs in sequence. Apple HIG calls for a pre-prompt explainer before the OS dialog cascade — higher grant rates and clearer expectations than firing them cold. Skipped on subsequent launches via @AppStorage("lota.hasRequestedPermissionsOnce")
Custom logo asset slot — drop a LotaLogo image set into Assets.xcassets/LotaLogo.imageset/ and the intro screen displays it clipped to a RoundedRectangle(cornerRadius: 21, style: .continuous) (Apple's app-icon "squircle" superellipse). Falls back to an SF Symbol if the asset isn't present
Background SpeechTranscriber asset prefetch — once speech permission is granted, PermissionsBootstrapper.prefetchSpeechModel() fires a detached Task that runs AssetInventory.assetInstallationRequest(supporting: [transcriber]) and downloadAndInstall() for the user's locale. By the time the user opens transcription mode the model is on disk; the previous flow downloaded it lazily on first transcription-mode entry, which delayed first captions and required network at that exact moment. TranscriptionEngine.startAudioEngineAndTranscriber keeps its lazy install path as a fallback
9-step guided tour — TutorialController + TutorialOverlay + TutorialStep. Hybrid welcome card → spotlight tour: starts with a centered glass card on the middle page, then dims the live UI and highlights actual elements with a yellow outline + caption pill that flips above/below depending on anchor position. Tap-anywhere advances; Skip button bottom-leading mirrors the transmit button position. Programmatically swipes between pages and switches modes (e.g. enters depth mode for the per-mode-settings step) so the user sees the real UI in context. Auto-runs after the permissions intro on first launch
Tour steps: welcome card → mode picker → mode settings (auto-switches to depth so the button appears) → status bar (transmissions explanation) → performance note about enabling all four protocols → swipe-left ARKit page → swipe-right Capture page → transmit button → end card
Replay Tutorial button — new Section("Help") at the bottom of TransmissionSettingsView with a Replay Tutorial button. Dismisses the sheet, waits 350 ms for the dismiss animation, then calls tutorial.start(). Wired through @EnvironmentObject TutorialController
TutorialAnchor preference key (Onboarding/TutorialAnchor.swift) — enum TutorialAnchor: Hashable { .modePicker, .modeSettings, .statusBar, .transmitButton } plus a .tutorialAnchor(_:) view modifier that publishes the view's global frame via TutorialAnchorPreference: PreferenceKey. The overlay reads anchor frames via .onPreferenceChange set on the root TabView in ContentView. Anchors are attached to the picker, status pill, mode-settings button, and transmit button on CameraView and ARKitTrackingView; GaussianCaptureView carries the picker anchor for future extension
Per-mode settings buttons — every mode that has mode-specific settings shows a glass <Mode> Settings pill (sliders icon + label) just under the status bar. Middle page: Depth, Point Cloud, Blob Track, Transcription, Motion, Audio. ARKit page: Body, Face, Hands. Color and Mono have no per-mode settings, so no button is shown. Driven by CaptureMode.settingsKind / TrackingMode.settingsKind and presented via a single .sheet(item: $activeModeSettings) per page
Eight new focused settings sheets in Onboarding/ModeSpecificSettings.swift — DepthSettingsView, PointCloudSettingsView, BlobTrackingSettingsView, TranscriptionModeSettingsView, MotionSettingsView, AudioSettingsView, BodyTrackingSettingsView, FaceTrackingSettingsView, HandsTrackingSettingsView. Each contains only the section(s) relevant to that mode, lifted from the old monolithic Settings sheet
TransmissionSettingsView — new focused sheet with only Receiver, Transport, NDI, OSC, Point Cloud Stream, and Protocol Info sections. Surfaced by tapping the floating status bar pill on the middle or left page (entire pill is now a button via the new onTap: parameter on StatusOverlayView). Lets the operator verify or adjust receiver IP and per-protocol toggles without wading through the full settings tree
EndpointSummary view (Views/EndpointSummary.swift) — small dim monospaced caption block at the bottom of the middle and left pages showing one row per active transmission (OSC 192.168.1.100:8000, TCP …:9847, PLY …:9848, NDI Broadcasting). Renders only when streaming is on and at least one of the four protocols is enabled, so the operator can confirm at a glance their settings are pointed at the right machine. Hidden completely when streaming is off
TransmissionChips (Views/TransmissionChips.swift) — reusable per-protocol chip strip embedded inside StatusOverlayView (middle page) and trackingStatusView (left page). One chip per enabled transmission, color-coded: dim white when configured-but-idle, statusLive (red) when actively streaming, statusError (orange) when streaming with a connection problem. The PLY chip is hidden on the ARKit page because tracking-mode forces .color and the renderer never enters the point-cloud compute path that fires onPointCloudFrame
Custom dot-grid mode picker (Views/ModePicker.swift) — ModePickerTrigger is a compact glass capsule containing the active mode's SF Symbol next to a square.grid.3x2.fill glyph, mirroring the iOS Camera app's "more controls" affordance. Tap opens ModePickerPanel, a glass-material rounded panel with a LazyVGrid of circular icon buttons (yellow when active, dimmed when disabled, e.g. LiDAR-required modes on a non-LiDAR device). Same pattern across all three pages: 2×4 grid for the 8 capture modes (middle), 1×3 row for the 3 tracking modes (left), 2×3 grid for the 5 export formats (right). Tap-outside-to-dismiss backdrop. No Menu or .popover involved, so this surface adds zero _UIReparentingView warnings
NSCameraUsageDescription added to Info.plist. The explicit AVCaptureDevice.requestAccess(for: .video) call from PermissionsBootstrapper requires the key (ARKit historically tolerated its absence)

Changed

Status bar redesigned as a floating pill on the middle page (StatusOverlayView) — was a full-width bar with FPS on the left and LiDAR on the right separated by a Spacer(). Now an auto-sized HStack of Label chips with rounded .a11yBackground (matches the right-page counter pill), centered horizontally and growing as more transmission protocols are enabled. Shows FPS, LiDAR status, mode-conditional chips (LISTENING / IMU / AUDIO / blob count), and one chip per enabled transmission protocol
Mode picker positioned consistently across all three pages — the picker is now the first VStack child on every page (was below StatusOverlayView on the middle page). When you swipe between pages the picker stays at the same Y coordinate, which feels seamless
Material Settings button moved up — was inside MaterialCaptureContent (rendered lower in the screen with the plane-status icon and Lock Plane button). Moved to the top of GaussianCaptureView's VStack, just below the status pill and above the "Point at a flat surface…" tip text. Same ModeSettingsButton style as every other mode now uses
Bottom-left gear button removed from middle and left pages. Settings are now reachable via two routes: tap the status bar (transmission protocols) or tap the per-mode settings button (mode-specific settings). The full monolithic SettingsView is unreachable, so its file was deleted
Every existing dropdown replaced with the new dot-grid picker pattern — three Menu-based dropdowns (one per page) removed in favor of ModePickerTrigger + ModePickerPanel. Per-mode caption text below the trigger removed (the active mode's icon in the trigger pill carries that information, and the mode picker's panel cells carry their own labels)
Audio mode lane positions — audio waveforms (BASS/MID/HIGH) had their baselines at lane bottoms with HIGH cut off behind the bottom UI buttons. The graph renderer maps a value to a Y position via the lane's value range; IMU ranges are symmetric (-2..2 for accel) so a still device's 0 lands at lane center, but audio ranges were unipolar (0..1) so silent input (0) landed at lane bottom. Changed all audio ranges (BASS, MID, HIGH, DL, DM, DH, BURST) and the FFT-band fallback from (0, 1) to (-1, 1). Silent baseline now sits at lane center matching IMU; loud peaks still grow upward to lane top; the bottom half of each lane becomes unused headroom (audio is non-negative)
`enterTrackingMode()` now routes through `setMode(.color, haptic: false)` — the renderer state (mono/depth/point cloud/blob/transcription/motion/audio) used to bleed through to the ARKit tracking page and sit underneath the skeleton/face/hand overlays. Worse, mic-driven modes (transcription/motion/audio) kept their engines running across the page swipe. Routing through setMode resets the renderer to color and tears down any auxiliary engine via the existing per-mode lifecycle teardown branches
`setMode` gains an optional `haptic: Bool = true` parameter so internal callers (enterTrackingMode, tutorial mode-switch side effects) can suppress the haptic that fires on explicit dropdown selection
Transmission chip color logic — chips now render dim white at 55% opacity when configured-but-idle and switch to Color.statusLive (red) when streaming starts, or Color.statusError (orange) on connection error. Previously the chips stayed full opacity and a separate red LIVE / statusText chip appeared at the end of the bar — duplicate signal. The standalone live chip and streamingStatusLabel helper were removed
`ContentView` now hosts the tutorial overlay and permissions cover — @StateObject TutorialController, @AppStorage flags for hasRequestedPermissionsOnce and hasSeenTutorial, @State anchorFrames populated via .onPreferenceChange(TutorialAnchorPreference.self). .fullScreenCover for the permissions intro on cold launch, .overlay { TutorialOverlay(...) } driven by tutorial.isActive, .environmentObject(tutorial) so TransmissionSettingsView can find it. .onChange(of: tutorial.isActive) { wasActive, isActive in } flips hasSeenTutorial to true the first time the tour dismisses (skip or completion both qualify; force-quitting mid-tour leaves the flag false so the next cold launch re-presents it)
`TutorialController.stop()` resets to `.color` mode on dismissal so the user lands somewhere predictable; the mode-settings tutorial step (which auto-switches to depth) doesn't leave the app in an unexpected state when skipped

Fixed

Renderer flicker on dropped depth frames — Depth, Point Cloud, and Blob Track modes flickered back to color whenever ARKit dropped a frame's .sceneDepth (common for the first 1–10 frames after any arSession.run(config) reconfigure, and occasionally under thermal/CPU pressure). The four drawDepth / drawPointCloud (×2) / drawBlobTracking paths in MetalRenderer had drawCamera fallbacks that visibly flashed the color feed in place of the expected visualization. Replaced with a plain return — the previously presented frame stays on screen for that tick. At 60 Hz a single skipped frame is invisible; no risk of leaking drawables since Metal handles unpresented drawables on its own. Also removed a redundant renderer?.captureMode = .color from the .face branch of reconfigureSession — enterTrackingMode's new setMode(.color) call covers it
PLY transmission visibly suppressed on the ARKit page — the PLY chip is hidden in the status bar (TransmissionChips takes a showPLY: Bool flag) and the PLY row is hidden in the bottom endpoint summary (plyStreamEnabled: false passed in from the left-page call site). PLY data was already not transmitted on the tracking page (since setMode(.color) keeps the renderer out of the point-cloud compute path that fires onPointCloudFrame), but the UI implied it was

Removed

`Views/SettingsView.swift` — deleted. The old monolithic Settings sheet is replaced by TransmissionSettingsView (transmission protocols only, opened by tapping the status bar) plus per-mode focused sheets (DepthSettingsView, PointCloudSettingsView, BlobTrackingSettingsView, TranscriptionModeSettingsView, MotionSettingsView, AudioSettingsView, BodyTrackingSettingsView, FaceTrackingSettingsView, HandsTrackingSettingsView, opened via the per-mode ModeSettingsButton). The bottom-left gear button on the middle and left pages goes away with it; the bottom-bar HStack on those pages is now just the streaming/transmit button on the right (and the small tracking-mode label centered on the left page)

1.2.0

April 23, 2026

Added

Material Capture mode: New .material option in the Gaussian Capture page's format dropdown (sits alongside COLMAP, Nerfstudio, Nerfstudio + Depth, Point Cloud). Captures a flat surface from one flash photo pair + LiDAR depth, runs through Apple's GPU compute pipeline, and exports a complete PBR material set as a single ZIP — ready to drop into Substance Designer/Painter, Blender, Unreal, Unity, TouchDesigner, or any PBR-aware DCC. Requires LiDAR
PBR map set in the export ZIP (5 PNGs + manifest):
Five Metal compute kernels in `Shaders.metal` (~210 lines): delightFlashPair (basecolor + intermediate specular), roughnessFromSpecular (local sharpness heuristic), normalFromDepth (central-difference depth gradient), aoBake (horizon-based AO), pbrSpherePreview (Cook-Torrance BRDF sphere via screen-space ray-sphere intersection). Plus orthoSampleHomography for orthorectifying the framing patch and a jointBilateralUpsample helper. All dispatched in a single command buffer (~220 ms total at 1024² with 64 AO samples on iPhone 15/16 Pro)
Closed-form 4-point homography solver — Homography.fromUnitSquare(toQuad:) in PBRMapBaker.swift projects the framing rect's 4 corners into image pixel space and solves the 3×3 mapping in pure Swift, no Accelerate dependency
Plane lock UX: ARKit raycast detects horizontal and vertical planes; tap Lock Plane to snap to the candidate. Auto-installs a 20 cm screen-aligned square patch centered on the camera's look-point (raycast camera-forward → plane intersection), oriented so output "right" = camera's right and output "top" = away from camera
Material Settings sheet (gear button on the capture page) — output resolution (512/1024/2048), AO sample count (32/64/128), normal convention (OpenGL/DirectX), delight strength slider, roughness scale slider; all persisted via UserDefaults
Material Save Summary sheet — embedded PBR sphere preview on a clear background, metallic toggle (manifest-only in v1.2), file-size estimate, Save button with a phase machine (configuring → saving → saved/failed), 1.5 s auto-dismiss after successful save
Autofocus locked during the flash-pair grab — AVCaptureDevice.focusMode = .locked before the flash-off frame, restored to .continuousAutoFocus after the flash-on frame, so the two grabs share the same focal distance and the delight + roughness signals stay registered
Haptic feedback on shutter tap — UIImpactFeedbackGenerator(style: .medium) so the capture feels acknowledged before the bake spinner appears
`MaterialRecorder.swift` — state machine (searching → detected → locked → framed → capturing → baking), torch sequencing, depth+pose snapshot at the flash-on instant, gyro drift check via quaternion difference between the two camera poses
`PBRMapBaker.swift` — nonisolated orchestrator that runs the full bake off the main actor (~220 ms at 1024² / 64 AO). Polling-based command-buffer completion (5 ms cb.status poll) since addCompletedHandler was unreliable in this environment
`PBRMapExporter.swift` — encodes 6 PNGs (preserving 16-bit depth for the height map), builds the JSON manifest with device + capture metadata, bundles into ZIP via ZIPFoundation
`PBRSphereView.swift` — lightweight Image-based sphere preview that composites onto whatever background it's embedded in (clear by design, so the summary sheet's background shows through)
`MaterialState.swift` — transient ObservableObject sibling to GaussianRecorder. Owned by CaptureManager, observed by GaussianCaptureView, so plane status / framing rect / baked maps don't leak between MaterialRecorder and the view layer
`MaterialCaptureContent.swift` — material-mode subview of GaussianCaptureView with status icon, plane-status-driven action area (Lock Plane / Unlock), Material Settings sheet trigger, baking progress chip
`LOTABinaryPLYReceiverV2.tox` TouchDesigner component — simpler drop-in receiver for binary PLY streaming. The original LOTABinaryPLYReceiver required manual configuration of frame parsing parameters; v2 auto-detects the binary header and exposes only the receiver port and Script TOP target. Available at the [LOTA docs page](https://lidarota.app)

Changed

ExportFormat enum gains .material case + new isSingleShot: Bool flag (cleaner than overloading needsPoses for capture-mode UX branching — also leaves room for future single-shot formats like SHARP Splat in v1.3)
GaussianRecorder defensively rejects .material in startRecording and stopRecording — material capture goes through MaterialRecorder + PBRMapBaker, never GaussianRecorder
GaussianCaptureView body branches on exportFormat == .material: format-aware top counter row (keyframe/point counters for 3D-scene formats, plane-status chip + framing dimensions for material), bottom button morphs between red toggle-record and a shutter button keyed on format.isSingleShot. .onChange(of: exportFormat) resets material state on every format switch
ARKit world-tracking config now always enables [.horizontal, .vertical] plane detection (used by Material; ignored by other modes — minor CPU cost on modern A-chips)
MetalRenderer gains 7 new pipeline slots (delight, roughness, normalFromDepth, aoBake, jointBilateralUpsample, orthoHomography, pbrSphere render). Internal access so PBRMapBaker can dispatch them
StreamingSettings gains 6 persisted material-capture properties (resolution, normal convention, AO sample count, delight strength, roughness scale, default metallic) plus a NormalMapConvention enum
CaptureManager holds materialState + materialRecorder + pbrMapBaker. New setTorch(_:) and setAutoFocus(_:) helpers wire AVCaptureDevice configuration to closures injected on MaterialRecorder. New nextFrame() async helper resolves a continuation from the next session-delegate callback, gated on hasPendingFrameAwaiter so we don't dispatch a Task per ARFrame
Binary PLY format is now the recommended path. The user guide and TouchDesigner companion documentation only document the binary receiver going forward (see Deprecated)

Fixed

Page swipes are now blocked while streaming is active. Previously, swiping the TabView mid-stream could tear down the active session and leave receivers stuck. Now matches the existing UX where the Settings button is also disabled while streaming
ARFrame retention warnings during Material capture. The session delegate's per-frame Task { @MainActor } dispatches were piling up faster than the main actor could drain them, tripping ARKit's "delegate is retaining N ARFrames" backpressure warning and eventually causing the camera to stop delivering frames (followed by <<<< FigCaptureSourceRemote >>>> errors and SIGKILL). Three layered fixes: (1) extract plane info into a value-type DetectedPlaneInfo instead of holding ARPlaneAnchor refs in pending Tasks; (2) only publish state.lockedPlaneTransform / lockedPlaneExtent on actual state transitions, not on every frame; (3) MaterialRecorder.processFrame early-returns synchronously when the locked plane is still in frame.anchors — no Task hop, no main-actor work. The steady state during a Material capture (locked → framed → baking) now does zero per-frame main-actor dispatches
Bake hung at `addCompletedHandler`. The completion handler reliably failed to fire even after cb.status reached .completed — verified by polling. Replaced with a 5 ms cb.status polling loop in waitForCompletion. Bake now reliably completes in ~220 ms
`PBRMapBaker` is nonisolated, not @MainActor. The CIContext.render YUV→BGRA conversions (~100–200 ms each at 1920×1440) and command-buffer encoding were blocking the main actor synchronously, causing the ARFrame retention buildup above. New PBRBakeParams value struct gets pulled from StreamingSettings on the main actor before the nonisolated bake runs
Depth texture is now copied into a fresh MTLTexture instead of CV-wrapped from ARKit's pool. The shared CVPixelBuffer caused implicit dependency on ARKit's depth-pool lifecycle, which prevented the GPU completion handler from firing. The ~200 KB copy at upload time decouples our pipeline entirely
Framing rect alignment in baked output. The rect was previously inheriting ARKit's arbitrary plane-local axes, producing tilted output (wood planks at ~15° angles, etc.). Now the rect's two axes are computed from the camera's right + forward directions projected onto the plane and orthogonalized — so output "top" = away from camera, "right" = camera's right
Edge streaking in baked maps. The 60% min-extent framing rect could project well beyond the camera's visible image area at oblique angles, causing clamp_to_edge sampling to repeat the rightmost/bottommost pixels as horizontal/vertical streaks. Now the rect is centered on the camera's look-point (raycast intersection) and capped at 20 cm side length, keeping the projected corners well inside the camera image
SF Symbol typo square.grid.3x3.square.fill → square.grid.3x3.square in MaterialCaptureContent

Deprecated

CSV (text) PLY streaming is deprecated and will be removed in a future update. Binary PLY format is ~40% smaller, parses faster on the receiver, and the new LOTABinaryPLYReceiverV2.tox TouchDesigner component makes the binary path easier to integrate than the CSV one. Non-binary PLY remains functional for backwards compatibility but the user guide and TouchDesigner companion components only document the binary path going forward. The "Binary Format" toggle in Settings → Point Cloud Stream is the recommended setting for all new captures

1.1.0

April 12, 2026

Added

Device Motion capture mode: New .deviceMotion mode streams iPhone sensor data over OSC. Four individually-toggleable sensors, Acceleration enabled by default — Gyroscope, Compass Heading, and Barometric Pressure off by default. Update rate picker (30/60/100 Hz). All values stream as per-axis OSC addresses so each maps 1:1 to a TouchDesigner OSC In CHOP channel. Does not require LiDAR — works on every iPhone
Motion OSC channels (all per-axis for clean TD CHOP mapping):
Audio Analysis capture mode: New .audio mode taps the microphone via AVAudioEngine and runs real-time DSP analysis using Apple's Accelerate/vDSP framework — no third-party dependencies. Streams analysis results over OSC for sound-reactive visuals. Four channel groups individually toggleable, Levels (bass/mid/high) enabled by default
Audio Levels — continuous frequency-band energy normalized 0–1 with rolling auto-gain:
Audio Beat Detection — per-band onset detection with state-of-the-art algorithm combining log-magnitude spectral flux + one-pole smoothing (~15Hz cutoff) + peak-picking (local maxima) + variance-adaptive threshold (mean + k·stddev where k is inversely modulated by coefficient of variation) + 20th-percentile noise-floor tracking + silence energy gate + 50ms minimum inter-onset time. Output is a binary switch (0 or 1) that holds at 1.0 for 50ms after each hit — TouchDesigner-friendly for Trigger CHOP routing:
Audio Dynamics — /lota/audio/burst — fast-vs-slow envelope difference, normalized 0–1, pulse-shaped output (rises instantly on transient, decays over ~200ms)
Audio FFT Spectrum — 20 log-spaced frequency bands across 20–20,000 Hz, each normalized 0–1 via per-bin rolling max. /lota/audio/fft/0 through /lota/audio/fft/19
Scrolling line graph visualization for both Motion and Audio modes — TouchDesigner CHOP viewer aesthetic. One lane per active value, stacked vertically with dynamic sizing. Colored lines scroll left as new samples arrive. Each lane has a dim center line, separator line, and colored glyph label at the right edge following the line's current Y position. Metal-rendered, captured by NDI
Smooth sub-sample scroll animation: classic oscilloscope trick — between sample arrivals, the graph slides left by a fractional sample-width based on elapsed time. Eliminates the "choppy steps" visible at 30Hz data + 60fps render. Works identically for Motion and Audio modes
Thread-safe ring buffer access in Motion + Audio engines and renderer — NSLock-protected snapshots with generation counters so stale handler callbacks after mode switches or sensor toggles become no-ops (prevents data race crashes)
Debounced sensor toggle restarts — when a user flips a sensor/channel toggle while in Motion or Audio mode, the engine restarts 50ms later so multiple rapid toggles produce a single restart (prevents crash-loop from rapid TCC requests)
Upfront permissions flow — when the user selects Motion mode, location permission (for compass heading) is requested immediately via a system dialog. When the user selects Audio mode, microphone permission is requested immediately. If denied, a clear overlay with "Open Settings" deep-link appears. Prevents the mid-session permission prompts that happened when toggling individual sensors
`NSMotionUsageDescription` and `NSLocationWhenInUseUsageDescription` added to Info.plist — fixes TCC-enforcement crash (__abort_with_payload on com.apple.tcc.kTCCServiceMotion) that occurred when CMMotionManager.startDeviceMotionUpdates() was called directly without the usage description (ARKit's internal CoreMotion use goes through a different TCC path, so initial launch worked; direct use on sensor toggle crashed)
`GlyphAtlas` character set expansion — baked glyph set expanded from #0123456789 to #0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ so graph lane labels (BASS, MID, HIGH, DL, DM, DH, BURST, F0–F19, AX, AY, AZ, GX, GY, GZ, HD, PR, AL) render correctly. Atlas bake code unchanged — just a larger characters: string at the call site
`GlyphAtlas` UV bleed fix — expanded the per-glyph bounding rect by 1px in each direction so anti-aliasing fringe pixels aren't clipped. Labels render crisply
FFT band gradient colors — F0 (lowest freq) through F19 (highest freq) use HSV hue sweep from red → blue so operators can visually track which band is which

Changed

CaptureMode enum extended with .deviceMotion (gyroscope icon) and .audio (waveform.and.magnifyingglass icon) cases — both requiresLiDAR = false
ContentView TabView restructured to handle the new modes' permission flows without breaking existing mode switching
MetalRenderer extended with motionGraphPipeline + motionCenterLinePipeline (shared between motion and audio modes) and new motionGraphTopInset / motionGraphBottomInset properties so graph lanes don't draw behind SwiftUI overlays
StreamingSettings gained 10 new persisted properties across motion and audio mode toggles
FrameEncoder.encode routes .deviceMotion and .audio to break — both modes are OSC-only (no TCP/UDP video frames)
Camera pose OSC (/lota/camera/position, /rotation, /euler) suppressed in Motion and Audio modes so the high-rate pose data doesn't drown out sensor/audio messages in OSC In CHOP

Removed

Early prototype global_dynamic audio channel — replaced by the cleaner burst-only Dynamics output after discovering the normalization math produced confusing stuck-at-0.67 values in most audio

1.0.9

April 9, 2026

Added

Blob Tracking capture mode: New .blobTracking mode recreates the TouchDesigner Blob Track TOP look — single-color hairline rectangle outlines around connected regions of LiDAR depth pixels within a configurable depth slab, all rendered in Metal so the NDI feed carries the complete visualization. Depth-based detection means no background plate required, no lighting sensitivity, works on a moving camera. Requires LiDAR
Four base styles for the underlying camera layer:
`BlobTracker` with TD-matching lifecycle: Connected-components labeling with union-find, 4-connectivity, confidence filtering, and owned scratch buffers (zero per-frame allocation). Full lifecycle state machine matching TD's onBlobStateChange: new → lost → revived → expired. Nearest-centroid ID assignment with configurable maxMoveDistance. Lost blobs are kept for reviveTime seconds and revived with the same ID if they reappear within reviveDistance. Age tracked in seconds via CFAbsoluteTimeGetCurrent()
Detection runs off the ARKit delegate thread on a dedicated blobQueue with drop-on-busy semantics. The delegate callback only captures depth CVPixelBuffer references and dispatches — never blocked by CCL or texture upload. At most one ARFrame is pinned in-flight via the dispatch closure, eliminating ARKit's "delegate retaining N ARFrames" backpressure warning. LiDAR duplicate-frame skip via unretained pointer comparison halves CPU work at 30 Hz
Three-pass Metal rendering pipeline in drawBlobTracking:
`GlyphAtlas`: New LOTA/Capture/GlyphAtlas.swift — bakes a small r8Unorm atlas containing #0123456789 once at startup using CoreText + CGContext. Stores per-glyph metrics (UV rect, advance, bearing) for runtime cursor-walk layout. Sized at 11pt × screenScale so labels render crisp at 1:1 on retina displays. Per-frame cost is just the cursor walk (~0.1 ms CPU); GPU draws the labels in a single instanced call
TD-matching OSC output: Field names match blobtrackTOP_Class verbatim. One OSCBundle per frame, padded to 10 slots for stable TouchDesigner CHOP channel counts:
Blob tracking settings organized into subsections mirroring TD's parameter pages:
Blob count + depth range in status HUD: StatusOverlayView shows "N blob(s)" with a hex-grid icon when the mode is active. The current depth range (e.g. "0.5m – 3.0m") appears under the mode dropdown so operators can see the active slab without opening Settings
`@Published blobCount` dedupe: Only dispatches a main-actor update when the count actually changes, preventing per-frame SwiftUI body re-runs in CameraView while the count is stable

Changed

CaptureMode enum extended with .blobTracking case — circle.hexagongrid icon, "Depth-based blob detection and tracking" description, requiresLiDAR = true
MetalRenderer gained three new pipeline states (blobBasePipeline, rectPipeline, glyphPipeline), glyphAtlas reference, rect instance buffer, glyph instance buffer, and the 3-pass drawBlobTracking method. Exposes metalDevice computed property and a makeBlobLabelTexture(width:height:) factory
StreamingSettings gained blob-tracking properties across detection, constraints, revival, and visual categories
SettingsView blob section restructured into four subsections (Detection, Depth, Constraints, Revival) mirroring TD's Blob Track TOP parameter pages. ColorPicker for rectangle outline color. Protocol Info row updated to "Blob OSC: /lota/blob/* (TD-compat, 10 slots)"
FrameEncoder.encode routes .blobTracking to encodeDepth — TCP/UDP receivers get raw Float32 depth frames in blob mode (identical to .depth mode), and the structured blob metadata is OSC-only
ContentView restructured: CameraPreviewView is now applied as a .background modifier on the TabView instead of being a ZStack sibling, reducing SwiftUI hosting-controller hierarchy conflicts when popovers (Menus) are presented inside the page-style TabView

1.0.8

April 8, 2026

Added

Live Transcription capture mode: New .transcription mode in the capture dropdown. Uses iOS 26's SpeechAnalyzer + SpeechTranscriber framework for fully on-device, real-time speech-to-text. No internet required. Does not require LiDAR — works on every iPhone. New TranscriptionEngine class wraps AVAudioEngine microphone capture + SpeechAnalyzer pipeline with Swift concurrency (AsyncStream<AnalyzerInput>)
Mirrored bar waveform visualization: When transcription mode is active, the camera view is replaced with a black canvas and a 200-bar white mirrored waveform driven by live microphone RMS amplitude. Rendered in Metal (new waveformVertex/waveformFragment shaders) so it streams through NDI automatically
Live transcript SwiftUI overlay: Recognized words appear on-screen as large centered captions while speaking. Updates on partial results (as words arrive) and finalized sentences
Speech model auto-download: First-time use downloads the language model via AssetInventory.assetInstallationRequest in the background. User sees a brief delay on first recognition session per language
Microphone + Speech Recognition permission flow: Requests both permissions on first entry to transcription mode. If denied, shows a clear "Enable in Settings" message with a deep-link button to the app's settings page
Transcription OSC output: Streams recognized speech as OSC messages when OSC streaming is enabled:
Transcription TCP/UDP streaming: New FrameType.speechText = 4 wire format. FrameEncoder.encodeSpeechText(_:kind:wordIndex:timestamp:) builds a 24-byte FrameHeader + UTF-8 payload. Header's width field repurposed for speech kind (0=word, 1=partial, 2=final), height for word index. Uses the existing TCP/UDP transports — no new networking code. Dedicated transcriptionStartTime / transcriptionRelativeTimestamp() helper since transcription mode doesn't have an ARFrame.timestamp source
Drag-and-drop TouchDesigner components: Two custom TD components are now available on the [LOTA docs page](https://lidarota.app) for receiving speech data:
Per-channel OSC output toggles: New "Transcription" section in Settings with three toggles (Send Per-Word, Send Partial, Send Final). Users can pick which transcription OSC addresses to send. Defaults to all on. Toggles apply to both OSC and TCP/UDP paths
Listening indicator in status bar: Pulsing mic.fill icon + "LISTENING" label appears next to the FPS counter when the transcription engine is actively capturing audio. Uses SwiftUI animation(.repeatForever) for the pulse

Changed

Camera position OSC suppressed in transcription mode: /lota/camera/position, /lota/camera/rotation, and /lota/camera/euler are no longer sent when the capture mode is .transcription — prevents the high-rate pose data from drowning out speech messages in OSC In DAT
StatusOverlayView gained an isListening: Bool parameter (default false) and an animated microphone indicator
CaptureMode enum extended with .transcription case — mic.fill icon, "Live speech-to-text" description, requiresLiDAR = false
CaptureManager.setMode() starts/stops the transcription engine on mode enter/exit — engine never runs in the background
StreamingProtocol.FrameType extended with speechText = 4 for the new text wire format
StreamingSettings gained three persisted toggles: transcriptionSendWords, transcriptionSendPartial, transcriptionSendFinal

Fixed

Waveform amplitude ring buffer sized to 200 samples (was 128) for finer horizontal resolution without performance cost

1.0.7

April 7, 2026

Changed

Unified mode selector: Camera/Streaming page mode picker replaced with a top dropdown menu matching the Gaussian Capture and ARKit Tracking pages. All three pages now use the same dropdown pattern — icon, label, chevron, description per option
Camera page bottom bar simplified to Settings (left) and Streaming toggle (right)

Removed

ModePickerView.swift — replaced by inline dropdown in CameraView

1.0.6

April 7, 2026

Added

ARKit mesh wireframe overlay: During Gaussian Splatting and Point Cloud capture, ARKit scene reconstruction renders a semi-transparent mesh overlay on the camera feed showing scanned surfaces building up in real time. Mesh triangles are colored by depth (cyan near, purple far). Uses ARMeshAnchor geometry rendered via a dedicated Metal pipeline with alpha blending
Keyframe selection: Intelligent frame filtering during capture — only saves frames where the camera has moved at least 5cm or rotated ~5 degrees since the last saved frame. Reduces dataset size by ~90% while maintaining spatial coverage. A typical 30-second capture produces ~80 well-distributed keyframes instead of ~900 redundant frames
Blur detection: Laplacian variance sharpness filter rejects motion-blurred frames before JPEG encoding. Runs on the Y (luminance) plane at 1/4 resolution for minimal overhead (<1ms). Blurry frames from fast camera movement are automatically discarded
Auto focus lock: Autofocus is automatically disabled when recording starts and restored when recording stops. Prevents focal length drift that would invalidate the single-intrinsics assumption in COLMAP/Nerfstudio datasets. Uses ARKit's official isAutoFocusEnabled API — no tracking interruption
ZIP export: Capture sessions are compressed into a single .zip file after recording. The original folder is deleted after successful compression. File size is shown in the save summary (e.g. "12.4 MB"). Uses ZIPFoundation (SPM)
Recording timer: Live elapsed time display (mm:ss format) shown during capture alongside frame and point counters
Haptic feedback on keyframe capture: Subtle tap via UIImpactFeedbackGenerator each time a frame passes the keyframe + blur gates and is saved to the dataset
Rolling accumulation buffer: When the Metal accumulation buffer reaches 5M points, the oldest 25% are evicted via memmove and new points continue accumulating. Prevents the visual stall that occurred in 1.0.5 during long recordings

Changed

Gaussian Capture page now shows live camera feed (was point cloud) with mesh wireframe overlay during recording, replacing the previous point cloud accumulation view. Dramatically reduces GPU load (~1ms vs ~5-8ms per frame)
Frame counter label changed from "frames" to "keyframes" to reflect the filtering behavior
GaussianRecorder.recordFrame() restructured: points are only accumulated from frames that pass both keyframe and blur gates (was: all frames accumulated, causing memory exhaustion on long recordings)
stopRecording() return type changed from URL? to (URL, Int64)? to include zip file size
SaveSummaryView now shows file size and updated iCloud message ("Your capture is ready to transfer via iCloud or AirDrop")
Save progress indicator shows two phases: "Saving..." then "Compressing..."

Removed

Point cloud accumulation as the default capture visualization — replaced by camera + mesh overlay
Depth overlay / coverage painting system (depthOverlayFragment shader, updateCoverage compute kernel, coverage texture) — replaced by ARKit mesh wireframe

1.0.5

April 4, 2026

Added

Depth confidence filtering: GPU compute kernel now filters depth pixels by ARKit confidence level before unprojection. Three-level segmented picker in Settings (All / Medium+ / High Only). Default Medium+ removes noisy edge pixels with zero performance cost. Also applied to GaussianRecorder's CPU-side point extraction during capture
NDI depth side-by-side streaming: New toggle in NDI settings sends a 2x-wide frame — left half is the current camera view (any mode), right half is the depth colormap. Standard format expected by TouchDesigner and Notch artists (popularized by iDepth NDI). Uses a separate Metal render pass for the depth visualization, blitted into a double-wide IOSurface
Hand tracking via Vision framework: New "Hands" mode on the ARKit Tracking page (swipe left). Detects up to 2 hands simultaneously using VNDetectHumanHandPoseRequest on the rear camera. 21 landmarks per hand streamed over OSC, organized by finger (/lota/hand/left/thumb, /lota/hand/left/index, etc.). Color-coded overlay: left hand = teal, right hand = orange
2D/3D hand coordinate toggle: Default 2D mode streams normalized screen-space coordinates (works on all iPhones). Opt-in 3D mode projects landmarks into world space using LiDAR depth (greyed out on non-LiDAR devices). is3d OSC flag tells receivers which coordinate space is active
Hand overlay view: Real-time bone + joint dot visualization drawn over camera feed using SwiftUI Canvas. Five finger chains from wrist to fingertip per hand, with aspect-fill crop correction for accurate alignment

Changed

Tracking settings sections consolidated from separate "Body Tracking" and "Face Tracking" into unified "Tracking" section with Skeleton Overlay, Face Overlay, Hand Overlay, and 3D Hand Coordinates toggles
TrackingMode enum extended with .hands case, added to visible mode picker alongside Body and Face
Hand detection runs on existing Vision dispatch queue (throttled every 2nd frame for ~30fps updates), uses ARWorldTrackingConfiguration with .sceneDepth to keep LiDAR available
Video transports (NDI, TCP, UDP, PLY) disconnected in hand mode for consistency with body/face tracking — OSC only on tracking page
Hand OSC data organized by finger: /lota/hand/{side}/wrist (3 floats), /lota/hand/{side}/thumb (12 floats), /lota/hand/{side}/index (12 floats), /lota/hand/{side}/middle (12 floats), /lota/hand/{side}/ring (12 floats), /lota/hand/{side}/pinky (12 floats)

1.0.4

April 3, 2026

Added

Metal compute shader for depth unprojection: Replaced CPU-side nested loop with a GPU compute kernel (unprojectDepth) that processes every depth pixel in parallel. Dispatches over actual depth texture dimensions — no hardcoded resolution assumptions
Dynamic ring buffer allocation: Point cloud ring buffer is lazy-allocated on first depth frame using the actual depth map resolution (CVPixelBufferGetWidth * Height). Automatically adapts to any LiDAR hardware without code changes
Full depth map sampling: Every pixel of the 256x192 depth map is now unprojected (previously sampled every 2nd pixel in both axes, discarding ~75% of depth data). ~4x spatial density increase
Max Depth setting: User-configurable maximum LiDAR depth range (1–10m, default 5m) replaces the hardcoded 5.0m filter. Gen 2 devices (iPhone 15/16 Pro, IMX591) can extend to ~10m. Applies to both live point cloud and Gaussian capture
Binary PLY encoding: Optional packed binary wire format for PLY streaming — 4-byte LE count header + 15 bytes/point (3 floats + 3 uint8 RGB). ~40% smaller than CSV text format. Toggle in Settings → Point Cloud Stream → Binary Format (default off)
Compute Quality setting: Segmented picker (Full / Balanced / Efficient) controlling GPU compute frame skip. Full = every frame, Balanced = every 2nd, Efficient = every 3rd. Defaults to Balanced to reduce thermal load
Adaptive thermal throttle: Automatically increases compute frame skip (max 4) when measured FPS drops below 50, recovers when FPS exceeds 55. Hysteresis gap prevents oscillation. Only active in Point Cloud mode
Depth change detection: Compares CVPixelBuffer identity between frames to skip redundant compute dispatches. LiDAR hardware runs at 30fps but ARKit interpolates to 60fps — half the frames reuse the same depth buffer. Zero-cost check, zero quality loss
TouchDesigner binary receiver: tools/td_pointcloud_receiver.py — high-performance binary point cloud receiver using numpy bulk parsing + Script TOP textures + GPU instancing. Handles 49K+ points at 60fps with ~1–2ms Python overhead

Changed

Point cloud unprojection moved from CPU (writePointsToSlot) to Metal compute shader (computePointsGPU), enabling full depth map sampling at interactive frame rates
Frame throttle removed from both MetalRenderer (was every 3rd frame) and GaussianRecorder (was every 3rd frame) — every ARKit frame is now processed
GaussianRecorder.extractPoints now samples every depth pixel (step=1, was step=2)
pointCloudMaxPoints setting removed (was 500–12,500 slider) — replaced by full depth map sampling with Max Depth range control
Ring buffer slot offset calculation uses dynamic depthMapPixelCount instead of hardcoded pointsPerFrame constant
PLY encoder (PLYEncoder.swift) now supports both CSV text (encode) and packed binary (encodeBinary) formats
Settings UI: "Max Points" slider replaced with "Max Depth" slider; added "Compute Quality" segmented picker; added "Binary Format" toggle under Point Cloud Stream
Deprecated td_ply_receiver.py (CSV/Script SOP) in favor of td_pointcloud_receiver.py (binary/GPU instancing)

Removed

pointsPerFrame hardcoded constant (12,500) — buffer capacity is now derived from depth map resolution at runtime
activeMaxPoints property and setMaxPoints() — no longer needed with full depth map sampling
pointCloudFrameCounter — replaced by thermal management frame skip system
recordFrameCounter in GaussianRecorder — frame throttle removed

1.0.3

April 1, 2026

Fixed

Face blend shape OSC channels now use descriptive names: TouchDesigner OSC In CHOP previously showed generic numbered channels (lota/face/blendshapes1, blendshapes2, …) because all 52 blend shapes were packed into a single OSC message. Each blend shape is now sent as its own named OSC address (e.g. /lota/face/browDown_L, /lota/face/eyeSquint_R), matching the labels shown in the app's face overlay.

Changed

Face blend shapes sent as an OSC bundle (one UDP datagram containing 52 individually addressed messages) instead of a single flat message with anonymous float arguments
Added OSCBundle struct to OSCEncoder.swift implementing the OSC 1.0 bundle wire format (#bundle\0 + timetag + size-prefixed messages)

1.0.2

March 31, 2026

Added

ARKit Body Tracking: Real-time 3D skeleton detection (91 joints) via rear camera using ARBodyTrackingConfiguration, streamed as 18 key joints over OSC at /lota/body/skeleton, /lota/body/root, /lota/body/detected
ARKit Face Tracking: Real-time facial motion capture (52 blend shapes) via front camera using ARFaceTrackingConfiguration, streamed over OSC at /lota/face/<blendShapeName>, /lota/face/transform, /lota/face/detected
ARKit Tracking page: New swipe-left page with Body/Face mode selector, live detection status badges, settings and transmit controls — mirrors the existing camera and Gaussian capture page pattern
Skeleton overlay: Real-time bone + joint dot visualization drawn over the camera feed in Body mode, projecting 3D skeleton joints to 2D screen coordinates via ARCamera.projectPoint
Face blend shape overlay: Live bar graph display of the top 8 most active facial blend shapes in Face mode
Overlay settings: Skeleton Overlay and Face Overlay toggles in Settings with UserDefaults persistence
Mocap mode (implemented but hidden): Simultaneous face + body tracking using Vision framework body detection on rear camera with LiDAR depth projection for approximate 3D joints — can be re-enabled by changing TrackingMode.visibleCases
TrackingMode enum (LOTA/Models/TrackingMode.swift) with body/face/mocap modes, device capability checks
ARKitTrackingView (LOTA/Views/ARKitTrackingView.swift) with full accessibility: VoiceOver labels, Voice Control input labels, Differentiate Without Color support, high-contrast colors, combined accessibility elements
SkeletonOverlayView (LOTA/Views/SkeletonOverlayView.swift) for body skeleton rendering
FaceOverlayView (LOTA/Views/FaceOverlayView.swift) for blend shape visualization
Debounced VoiceOver tracking announcements (500ms) to prevent spam from noisy detection at frame edges
Protocol info in Settings now shows Body OSC and Face OSC entries

Changed

ContentView extended to three-page TabView: ARKit Tracking (left) <- Camera (center) -> Gaussian Capture (right)
Page transitions now track previousPage to handle fast swipes that skip intermediate pages
Page transitions blocked while Gaussian recording is active to prevent session reconfiguration mid-capture
previousStreamingMode only saved when leaving the camera page (prevents overwrite from other pages)
CaptureManager handles ARKit session reconfiguration: enterTrackingMode() switches to body/face config and disconnects video transports (OSC stays alive); exitTrackingMode() restores world tracking without .resetTracking to preserve world map continuity
Video transports (TCP/UDP/NDI/PLY) disconnected from renderer when entering tracking mode (incompatible frame format), reconnected on return
isReconfiguring state drives a loading indicator and disables streaming toggles during ~0.5s session switch window
Settings navigation title changed from "Streaming" to "Settings"
OSCMessage and OSCArg marked Sendable, encoded() marked nonisolated — resolves 18 actor isolation warnings
dlog marked nonisolated — resolves main actor isolation warning
Replaced deprecated UIScreen.main.nativeScale with UITraitCollection.current.displayScale

Planned

Mesh export and point cloud settings enhancements

0.3.0

March 25, 2026

Added

Enhanced Gaussian Splat capture workflow with improved recording controls
Nerfstudio exporter (NerfstudioExporter.swift) for alternative 3D reconstruction pipelines
ARKit body tracking plan for future implementation
ARKit Gaussian Splat enhancement plan
NDI SDK vendor headers and static library (Vendor/NDI/) to resolve Xcode Cloud build failures

Changed

Improved GaussianRecorder with extended capture capabilities
Enhanced MetalRenderer with additional rendering features
Updated GaussianCaptureView UI for better capture workflow
Updated CaptureManager with new capture options

0.2.0

March 24, 2026

Added

Accessibility features: Voice Control support and Reduced Motion preferences
AccessibilityHelpers.swift utility for centralized accessibility support
Accessibility labels and hints across all views (CameraView, GaussianCaptureView, ModePickerView, StatusOverlayView)
UI test suite (LOTAUITests.swift)

Changed

Improved ModePickerView with accessible navigation
Enhanced StatusOverlayView with VoiceOver support
Updated SettingsView with accessibility considerations

0.1.0

March 22, 2026

Added

Initial project build and app structure
Capture system: CaptureManager, GaussianRecorder, COLMAPExporter, LiDARCapability
Rendering pipeline: MetalRenderer, Metal shaders (Shaders.metal), DepthColorMap, CameraPreviewView
Streaming: Full streaming stack — StreamingService, FrameEncoder, NDITransport, TCPTransport, UDPTransport, OSCTransport, OSCEncoder, PLYEncoder
Views: CameraView, GaussianCaptureView, ModePickerView, SettingsView, StatusOverlayView, FolderPickerView
Models: CaptureMode, StreamingSettings
App icon (light and dark variants)
ContentView as main navigation hub
Receiver tools: lota-receiver.swift and td_ply_receiver.py
README with project overview

0.0.1

March 18, 2026

Added

Initial commit — project scaffolding and Xcode project setup

Changelog

1.2.8

1.2.7

1.2.6

1.2.5

1.2.4

1.2.3

1.2.2

1.2.1

1.2.0

1.1.0

1.0.9

1.0.8

1.0.7

1.0.6

1.0.5

1.0.4

1.0.3

1.0.2

0.3.0

0.2.0

0.1.0

0.0.1

Get Development Updates