Introduction
Getting Started
LOTA turns your iPhone's LiDAR sensor into a professional spatial capture and streaming tool. Stream depth, color, and point cloud data over the network in real time, capture datasets for 3D reconstruction, and stream motion capture data — no extra hardware required.
Requirements
- iPhone 12 Pro or later (LiDAR required for Depth, Point Cloud, Blob Track, and Gaussian Capture)
- Color, Mono, Transcription, Motion, and Audio modes work on all iPhones
- iOS 26.2 or later
- Wi-Fi network (for streaming features)
Install LOTA and complete first launch
Download LOTA from the App Store. The first time you open the app, a welcome screen lists the four iOS permissions LOTA needs (Camera, Microphone, Speech Recognition, Location for compass heading) plus a privacy reassurance — LOTA doesn't track or store anything; captures stream over your local network or save to the iCloud folder you choose. Tap Continue and the four OS permission dialogs appear in sequence.
Take the 9-step guided tour
After permissions, a hybrid welcome card → spotlight tour walks you through the three pages, the mode picker, the status bar, mode-specific settings, and the transmit button. Tap anywhere to advance, or use the Skip button at the bottom-left. Replay anytime from Transmission Settings → Help → Replay Tutorial.
Choose a capture mode
Tap the mode picker at the top — a glass capsule showing the active mode's icon next to a 3×2 dot-grid glyph, mirroring the iOS Camera app's "more controls" affordance. A glass panel drops down with all 8 modes laid out in a 2×4 grid of circular icon buttons (Color, Mono, Depth, Point Cloud, Blob Track, Transcription, Motion, Audio). Active mode is highlighted yellow; LiDAR-required modes are dimmed and disabled on non-LiDAR devices. Each mode activates instantly — no restart required.
Start streaming or recording
Tap the transmit button (bottom right) to broadcast over the network — all enabled transports send simultaneously. Configure transports by tapping the floating status bar pill at the top (opens Transmission Settings). Tweak per-mode options via the <Mode> Settings button just under the status bar. Swipe right to access Gaussian Capture for recording 3D datasets or capturing PBR materials.
Capture
Capture Modes
LOTA provides eight distinct capture modes on the Camera / Streaming page. Tap the mode picker at the top (glass capsule with the active mode's icon next to a 3×2 dot-grid glyph) to open a glass panel laid out as a 2×4 grid of circular icon buttons. Active mode is highlighted yellow; LiDAR-required modes are dimmed on non-LiDAR devices. Tap outside the panel to dismiss without picking.
Color
Mono
Depth
Point Cloud
Blob Track
Transcription
Motion
Audio
Switching modes
Tap the mode picker at the top of the screen and select a cell from the dot-grid panel, or say "Switch to Depth" with Voice Control enabled. Switching is instant and does not interrupt an active stream.
Blob Track mode
Select Blob Track from the mode picker to turn the iPhone into a wireless TouchDesigner-compatible blob tracker. LOTA carves a configurable depth slab out of the LiDAR scan, finds connected regions of in-range pixels, assigns each one a stable ID across frames, and streams the result as both a video (NDI) and per-blob metadata (OSC). Replaces a Kinect + Blob Track TOP chain with a single iPhone — no background plate, no lighting calibration, works on a moving camera.
While active
- Blob count HUD — An "N blob(s)" count appears in the status bar with a hex-grid icon
- Live depth range — The current slab (e.g. 0.5 m – 3.0 m) is shown just below the status pill so operators can verify the active range without opening Settings
- Hairline outlines — Each detected blob gets a 1-pixel rectangle drawn over the camera feed in your chosen color
- NDI captures everything — The full composition (base layer + rectangles + optional ID labels) is captured by NDI so receivers see exactly what the phone shows
- Background detection — Connected-components labeling runs off the ARKit delegate thread so the main thread stays responsive for touch and UI
Base styles
Set in Blob Track Settings → Detection (tap the Blob Track Settings button below the status bar). Controls what the underlying camera layer looks like behind the rectangles.
- Color (default) — Live color camera feed. Operator-friendly view, see what you're filming
- Mono — Grayscale camera feed. Cleaner look, no color distractions
- Mask — Grayscale subject silhouette on black. Verifies what's being detected and hides the background
- Binary — Pure white silhouette on black. Authentic TouchDesigner Blob Track TOP look — maximum contrast
Toggle Draw Blob Bounds off to see only the base layer. Toggle Show ID Labels on to draw #1, #7, etc. labels just outside the top-right corner of each bbox. Both rectangles and labels use the Blob Color picker.
OSC addresses (TD-compatible)
Field names match TouchDesigner's blobtrackTOP_Class verbatim. Every message is padded to 10 slots so CHOP channel counts stay stable as blobs come and go — empty slots have id == 0 for filtering via a Select CHOP.
- /lota/blob/count — active blob count (int)
- /lota/blob/ids — 10 stable tracker IDs
- /lota/blob/u, /lota/blob/v — normalized centroid (10 floats each)
- /lota/blob/width, /lota/blob/height — normalized bbox size (10 floats each)
- /lota/blob/tx, /lota/blob/ty — pixel centroid in 256×192 depth space (10 ints each)
- /lota/blob/age — seconds tracked (10 floats)
- /lota/blob/state — 0 = new, 1 = revived, 2 = lost, 3 = expired (10 ints)
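Because every array is padded to 10 slots, receivers outside TouchDesigner need the same id == 0 filter a Select CHOP would apply. A minimal plain-Python sketch (the sample values are synthetic, standing in for one received OSC frame):

```python
# Filter LOTA's fixed 10-slot blob arrays down to the active blobs.
# In a real receiver these lists come from the /lota/blob/ids, /u, /v,
# /width, /height messages of a single frame.

def active_blobs(ids, u, v, width, height):
    """Return a dict per slot whose id != 0 (id == 0 marks empty padding)."""
    blobs = []
    for i, blob_id in enumerate(ids):
        if blob_id == 0:                      # padded empty slot, skip
            continue
        blobs.append({
            "id": blob_id,
            "u": u[i], "v": v[i],             # normalized centroid
            "w": width[i], "h": height[i],    # normalized bbox size
        })
    return blobs

# Example frame: two active blobs, eight empty slots
ids    = [7, 12, 0, 0, 0, 0, 0, 0, 0, 0]
u      = [0.25, 0.70] + [0.0] * 8
v      = [0.40, 0.55] + [0.0] * 8
width  = [0.10, 0.08] + [0.0] * 8
height = [0.20, 0.15] + [0.0] * 8

blobs = active_blobs(ids, u, v, width, height)
```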
Lifecycle states
A blob fires state = 0 (new) on first detection, then is silent while it's actively tracked. If it disappears, it transitions to lost and is held in the revival window for Revive Time seconds. If it reappears within Revive Distance of where it was last seen, it gets the same ID back and fires revived. Otherwise it transitions to expired and is permanently dropped. This matches TouchDesigner's blob lifecycle exactly.
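The revival rule can be sketched as a small predicate: a lost blob recovers its ID only if a detection reappears close enough, soon enough. The field names and Euclidean distance metric here are illustrative, not LOTA's internals:

```python
import math

# The two thresholds mirror the Revive Time / Revive Distance settings.
REVIVE_TIME = 0.5       # seconds a lost blob stays eligible for revival
REVIVE_DISTANCE = 0.2   # normalized distance for the revival match

def try_revive(lost, detection, now):
    """Return the lost blob's ID if the detection qualifies, else None."""
    if now - lost["lost_at"] > REVIVE_TIME:
        return None       # revival window expired -> blob goes to 'expired'
    dx = detection["u"] - lost["u"]
    dy = detection["v"] - lost["v"]
    if math.hypot(dx, dy) > REVIVE_DISTANCE:
        return None       # reappeared too far away -> treated as a new blob
    return lost["id"]     # same ID comes back, state fires 'revived'

lost = {"id": 7, "u": 0.50, "v": 0.50, "lost_at": 10.0}
revived_id = try_revive(lost, {"u": 0.55, "v": 0.52}, now=10.3)
```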
TCP / UDP fallback
In blob mode, TCP/UDP carries raw Float32 depth maps (the same format as the Depth capture mode), so receivers that want to do their own client-side analysis get the raw data. The structured per-blob metadata is OSC-only.
Transcription mode
Select Transcription from the mode picker to turn your iPhone into a wireless live speech-to-text source for creative tools. The camera view is replaced with a black canvas and a 200-bar mirrored waveform driven by the microphone. Recognized words appear on screen as live captions. Uses iOS 26's on-device SpeechAnalyzer framework — fast, private, offline.
First-time setup
Microphone and Speech Recognition permissions are requested upfront on first launch (along with Camera and Location), and the on-device speech model for your language is prefetched in the background. By the time you open this mode for the first time, transcription is usually ready to go — no model-download wait. If you denied either permission, the mode displays a message with a shortcut to iOS Settings to grant access. If the prefetch didn't finish or failed, the mode falls back to downloading on first use.
While active
- Listening indicator — A pulsing microphone icon and "LISTENING" label appear in the status bar
- Waveform — 200 mirrored bars pulse with your voice in real time and stream through NDI as a black-and-white video source
- Live captions — Recognized words appear on screen as large centered text as they arrive
- Camera tracking OSC suppressed — Camera position messages are automatically paused so speech data stands out in your OSC receiver
- Clean exit — Leaving transcription mode stops the audio engine completely. No background microphone access.
OSC addresses
When OSC streaming is enabled, transcription sends these addresses:
- /lota/speech/word — each recognized word (string)
- /lota/speech/word_count — incrementing counter (int, visible in OSC In CHOP)
- /lota/speech/partial — running partial transcript (string)
- /lota/speech/final — finalized sentences (string)
String messages arrive in TouchDesigner's OSC In DAT (not OSC In CHOP — CHOP only handles numeric channels). The word_count integer is the only speech message visible in OSC In CHOP, useful for signal-flow monitoring.
TCP / UDP binary wire format
For custom integrations, LOTA also sends recognized speech as binary frames over TCP and UDP: FrameType.speechText = 4, a 24-byte FrameHeader + UTF-8 payload. The header's width field is repurposed for speech kind (0 = word, 1 = partial, 2 = final), height for word index. Use the drag-and-drop TD components below to parse this format automatically.
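For receivers that can't use the TD components, here is a parser sketch. The exact 24-byte header layout is not documented in this section, so the code ASSUMES six little-endian UInt32 fields (frameType, width, height, payload length, two reserved); verify against a live stream before relying on it:

```python
import struct

# Hypothetical speech-frame parser. The assumed header layout is:
#   UInt32 frameType  (4 = speechText)
#   UInt32 width      (repurposed: 0 = word, 1 = partial, 2 = final)
#   UInt32 height     (repurposed: word index)
#   UInt32 length     (UTF-8 payload byte count)  -- assumption
#   UInt32 x2 reserved                            -- assumption

KIND = {0: "word", 1: "partial", 2: "final"}

def parse_speech_frame(data: bytes):
    frame_type, kind, word_index, length, _r0, _r1 = struct.unpack_from("<6I", data)
    if frame_type != 4:                    # FrameType.speechText = 4
        raise ValueError("not a speech frame")
    text = data[24:24 + length].decode("utf-8")
    return KIND[kind], word_index, text

# Round-trip a synthetic frame built with the same assumed layout
payload = "hello".encode("utf-8")
frame = struct.pack("<6I", 4, 0, 12, len(payload), 0, 0) + payload
kind, index, text = parse_speech_frame(frame)
```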
Which transport should you use?
- OSC — Easiest for TouchDesigner, Max/MSP, Resolume. Drop in an OSC In DAT, set the port, done.
- TCP / UDP — Use the LOTASpeechTCP / LOTASpeechUDP components (downloads in the TouchDesigner section below) for plug-and-play binary parsing, or integrate into custom apps, game engines, and scripts that don't have OSC parsers.
- NDI — When you want the waveform visual as a live video source for reactive effects or broadcast overlays.
Motion mode
Select Motion from the mode picker to turn the iPhone into a wireless motion-sensor OSC source. LOTA reads the device motion sensors and streams them as OSC. Each active sensor value also renders as its own scrolling line graph lane on screen so operators can see the data in real time — the full graph composition is captured by NDI. Works on every iPhone — no LiDAR required.
Sensors (individually toggleable in Settings)
- Acceleration (ON by default) — Gravity-removed, G-force units. Addresses /lota/motion/accel/x, /y, /z
- Gyroscope — Rotation rate in rad/s. Addresses /lota/motion/gyro/x, /y, /z
- Compass Heading — Degrees 0–360 (requires Location permission). Address /lota/motion/heading
- Barometric Pressure — kPa plus relative altitude in meters from session start. Addresses /lota/motion/pressure and /lota/motion/altitude
Update rate picker: 30 / 60 / 100 Hz. 30 Hz matches ARKit. 100 Hz is useful for latency-critical controllers.
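To sanity-check the motion stream without an OSC library, the single-float messages can be decoded with the standard library alone, following OSC 1.0 encoding (null-terminated strings padded to 4-byte boundaries, big-endian arguments). A minimal sketch, handling only the ",f" type tag:

```python
import struct

def _padded_string(data, offset):
    """Read an OSC string: null-terminated, padded to a 4-byte boundary."""
    end = data.index(b"\x00", offset)
    s = data[offset:end].decode("ascii")
    return s, offset + ((end - offset) // 4 + 1) * 4

def decode_float_message(data):
    """Decode a single-float OSC message like /lota/motion/accel/x."""
    address, offset = _padded_string(data, 0)
    tags, offset = _padded_string(data, offset)
    if tags != ",f":
        raise ValueError(f"unsupported type tags: {tags}")
    (value,) = struct.unpack_from(">f", data, offset)  # OSC args are big-endian
    return address, value

# Hand-build a matching message to verify the round trip:
# 20-char address + null + pad to 24 bytes, then ",f" padded to 4, then the float
msg = (b"/lota/motion/accel/x\x00\x00\x00\x00"
       + b",f\x00\x00"
       + struct.pack(">f", 0.5))
address, value = decode_float_message(msg)
```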
Permissions
Motion sensors use NSMotionUsageDescription; heading uses NSLocationWhenInUseUsageDescription. If denied, the relevant sensors silently skip (compass stays at −1, etc.) — no crash. Permissions are requested upfront when entering the mode, not mid-session.
Scrolling graph aesthetic
Each lane carries a two-letter label (AX, AY, AZ, GX, GY, GZ, HD, PR, AL) that follows the line's current Y position. Between sample arrivals, the graph slides left by a fractional sample-width (sub-sample scroll) so the visual feels continuous even at 30 Hz data rates. Rendered in Metal and captured by NDI.
Use cases
- Phone as a wireless tilt / shake / toss controller for reactive visuals in TouchDesigner, Resolume, or Max/MSP
- Environmental sensing — barometric pressure drift and altitude tracking for installations
- Wearable motion source — phone clipped or strapped to a performer streams tilt, rotation rate, and compass heading as OSC
- Camera pose OSC is automatically suppressed in Motion mode so sensor channels stand out cleanly in an OSC In CHOP
Audio mode
Select Audio from the mode picker to turn the iPhone into a wireless real-time audio analysis source. LOTA taps the microphone via AVAudioEngine and extracts musically relevant features using Apple's Accelerate/vDSP framework — zero third-party dependencies, fully on-device. Each active channel renders as a scrolling graph lane and is captured by NDI. Works on every iPhone — no LiDAR required.
Four channel groups (individually toggleable)
- Levels (ON by default) — Continuous 0–1 energy per frequency band with rolling auto-gain. /lota/audio/bass, /lota/audio/mid, /lota/audio/high
- Beat Detection — Binary 0/1 switch per band on detected onset (holds at 1.0 for 50 ms after each hit). /lota/audio/drums/low, /lota/audio/drums/mid, /lota/audio/drums/high
- Dynamics — Fast-vs-slow envelope difference, pulse-shaped 0–1 (rises instantly on transient, decays over ~200 ms). /lota/audio/burst
- FFT Spectrum — 20 log-spaced frequency bands across 20–20,000 Hz, each normalized 0–1 via per-bin rolling max. /lota/audio/fft/0 through /lota/audio/fft/19
Update rate: 30 / 60 Hz. FFT and analysis run internally at ~86 Hz and decimate to the user's output rate.
Frequency bands
- BASS (20–200 Hz) — kick, sub-bass, bass guitar fundamentals
- MID (200–2000 Hz) — vocals, snare body, guitars, most instruments
- HIGH (2000–8000 Hz) — snare attack, hi-hats, vocal consonants, cymbals
Beat detection algorithm
Multi-stage onset detection designed to be accurate on dense, dynamic music without false-triggering on silence:
- 1. Log-magnitude spectral flux per band (more perceptually aligned than raw flux)
- 2. One-pole lowpass smoothing (~15 Hz cutoff)
- 3. Peak-picking on local maxima — catches rapid drum fills that threshold-only methods miss
- 4. Adaptive threshold noise_floor + k·stddev, where k is inversely modulated by coefficient of variation (dense music gets lower k, steady signals get higher k)
- 5. 20th-percentile noise-floor tracking over ~3 s (robust to sustained beats)
- 6. Silence energy gate per band — no beats when the band itself is quiet
- 7. 50 ms minimum inter-onset time — supports up to ~1200 BPM / 32nd notes
Drum channels are binary (0 or 1) — trivial to route into a TouchDesigner Trigger CHOP or Logic CHOP without thresholding.
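Steps 4 through 6 above can be sketched in a few lines on a rolling window of per-band flux values. The constants (base k, silence gate) are illustrative, not LOTA's tuned values:

```python
import statistics

def onset_threshold(flux_history, k_base=2.0):
    """Adaptive threshold: noise_floor + k * stddev, k lowered for busy signals."""
    mean = statistics.fmean(flux_history)
    std = statistics.pstdev(flux_history)
    # Step 5: 20th-percentile noise floor, robust to sustained beats
    floor = sorted(flux_history)[len(flux_history) // 5]
    # Step 4: coefficient of variation modulates k (dense music -> lower k)
    cv = std / mean if mean > 1e-9 else 0.0
    k = k_base / (1.0 + cv)
    return floor + k * std

def is_onset(flux, flux_history, band_energy, silence_gate=0.01):
    """Step 6: no beats when the band itself is quiet."""
    if band_energy < silence_gate:
        return False
    return flux > onset_threshold(flux_history)
```

A peak-picking pass (step 3) and the 50 ms inter-onset guard (step 7) would sit on top of this in a full implementation.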
Permissions & limitations
On first entry to Audio mode, iOS prompts for Microphone access. If denied, a clear overlay with an "Open Settings" deep-link appears and no audio engine starts.
System audio limitation: LOTA analyzes the microphone input only. It cannot capture system audio (Spotify, Apple Music, YouTube) because iOS sandboxing doesn't expose system audio to third-party apps. To analyze music playing on the iPhone, play it through the speakers — LOTA's mic picks it up naturally. For high-fidelity analysis, play audio from a separate source into the iPhone's mic.
TouchDesigner pattern
Drop in an OSC In CHOP and point it at the phone's IP + port 9000. Enable Levels (default). Channels bass, mid, high appear and start pulsing. Toggle Beat Detection on → drums_low, drums_mid, drums_high appear as clean 0/1 signals you can route into a Trigger CHOP for beat-synced effects. FFT bands render with a red → blue color gradient (F0 = lowest, F19 = highest) so you can see where energy is on the spectrum at a glance.
Network
Streaming
Tap the transmit button (bottom right) to start streaming. All enabled transports send simultaneously. Configure transports by tapping the floating status bar pill at the top of the screen — this opens Transmission Settings, a focused sheet with Receiver IP, TCP/UDP, NDI, OSC, Point Cloud Stream, and Protocol Info sections. While streaming, the per-protocol chip in the status bar (OSC, NDI, TCP/UDP, PLY) goes from dim white to red, and a small dim text block at the bottom of the screen lists each active transmission as <protocol> <host>:<port> so you can confirm settings at a glance.
NDI
Broadcasts as LOTA (iPhone) on your network and is auto-discovered by TouchDesigner, OBS, vMix, Resolume, and any other NDI-compatible receiver. No IP configuration needed. Optional side-by-side mode sends a 2x-wide frame with camera view on the left and depth colormap on the right.
TCP / UDP
Sends H.264 video for Color/Mono and raw Float32 depth maps for Depth and Point Cloud modes. TCP reconnects automatically; UDP is fire-and-forget. Set the destination IP and port in Settings.
OSC
Streams real-time camera tracking data at ~30 Hz over UDP. Point any OSC-capable tool at LOTA's IP and port to receive:
- /lota/camera/position — x, y, z (3 floats)
- /lota/camera/rotation — quaternion x, y, z, w (4 floats)
- /lota/camera/euler — pitch, yaw, roll (3 floats)
- /lota/mode — current capture mode (1 Hz)
- /lota/fps — current frame rate (1 Hz)
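Receivers that consume only the quaternion can derive Euler angles themselves. The Tait-Bryan convention below (x = pitch, y = yaw, z = roll) is an assumption; compare against the euler address on a live stream to confirm the axis mapping:

```python
import math

def quat_to_euler(x, y, z, w):
    """Convert a unit quaternion to (pitch, yaw, roll) in radians.
    Axis convention assumed: x = pitch, y = yaw, z = roll."""
    pitch = math.atan2(2 * (w * x + y * z), 1 - 2 * (x * x + y * y))
    # Clamp to avoid math domain errors from float noise at the poles
    s = max(-1.0, min(1.0, 2 * (w * y - z * x)))
    yaw = math.asin(s)
    roll = math.atan2(2 * (w * z + x * y), 1 - 2 * (y * y + z * z))
    return pitch, yaw, roll

# 90-degree rotation about z should come back as roll ~= pi/2
pitch, yaw, roll = quat_to_euler(0.0, 0.0, math.sin(math.pi / 4), math.cos(math.pi / 4))
```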
PLY (live point cloud)
Live point cloud over TCP, port 9848 by default. Pair it with the LOTABinaryPLYRecieverV2.tox drop-in component (TouchDesigner section below) — it auto-detects the binary header and only asks for the port and a Script TOP target. A legacy CSV text mode remains available for custom receivers but is deprecated and will be removed in a future update.
How to start streaming
Open Transmission Settings (tap the status bar pill)
Tap the floating status bar pill at the top of the screen. Enable the protocols you need and set the destination IP (shared across all transports) and port for each. NDI requires no configuration.
Connect to the same Wi-Fi network
LOTA and your receiving machine must be on the same local network. A 5 GHz network is recommended for lowest latency.
Tap the stream button (bottom right)
All enabled protocols start simultaneously. The status bar shows a live indicator for each active protocol.
Motion Capture
ARKit Tracking
Swipe left from the main camera page to access ARKit motion capture. Three tracking modes stream data over OSC in real time.
Body Tracking
3D skeleton detection via the rear camera. Tracks 91 ARKit joints, streams 18 key joints over OSC. Visual overlay shows white bones with green joint dots. Requires A12 chip or later.
- /lota/body/skeleton — 18 joint positions
- /lota/body/root — root transform
- /lota/body/detected — detection state
Face Tracking
52 facial blend shapes captured via the front-facing TrueDepth camera. Overlay shows a bar graph of the top 8 most active blend shapes. Requires TrueDepth camera (iPhone X or later).
Each blend shape is sent as its own named OSC address (e.g. /lota/face/browDown_L, /lota/face/eyeSquint_R) bundled in a single UDP datagram.
Hand Tracking
Detects up to 2 hands simultaneously via the rear camera using the Vision framework. 21 landmarks per hand, streamed over OSC organized by finger. Overlay shows bone chains with joint dots — teal for left hand, orange for right.
- /lota/hand/{left|right}/wrist — 3 floats
- /lota/hand/{left|right}/thumb — 12 floats
- /lota/hand/{left|right}/index — 12 floats
- /lota/hand/{left|right}/middle — 12 floats
- /lota/hand/{left|right}/ring — 12 floats
- /lota/hand/{left|right}/pinky — 12 floats
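Since 21 landmarks = wrist + 5 fingers × 4 joints, each 12-float finger message is four (x, y, z) joints. A sketch of unpacking one finger message (the base-to-tip joint ordering is assumed, not stated in this section):

```python
def unpack_finger(values):
    """Split a 12-float finger message into four (x, y, z) joint tuples."""
    assert len(values) == 12, "each finger message carries 4 joints x 3 floats"
    return [tuple(values[i:i + 3]) for i in range(0, 12, 3)]

# e.g. the payload of one /lota/hand/left/index message (synthetic values)
joints = unpack_finger([0.10, 0.20, 0.0,
                        0.15, 0.30, 0.0,
                        0.18, 0.40, 0.0,
                        0.20, 0.50, 0.0])
```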
Hand coordinate modes
2D (default) — normalized screen-space coordinates. Works on all iPhones. 3D (opt-in) — world-space coordinates projected via LiDAR depth. Requires a LiDAR device. Toggle in Hands Settings → 3D Hand Coordinates (tap the Hands Settings button below the status bar on the ARKit Tracking page).
Integration
TouchDesigner
LOTA is built with TouchDesigner workflows in mind. Stream live depth, color, and point cloud data directly into your TD project with no capture cards or expensive rigs.
LOTABinaryPLYRecieverV2.tox — Binary Point Cloud Receiver
Drop-in TouchDesigner receiver for LOTA's binary PLY stream. Auto-detects the binary header — only asks for the receiver port and a Script TOP target. Uses numpy bulk parsing and GPU instancing to handle 49K+ points at 60 fps. Enable Binary Format in Transmission Settings → Point Cloud Stream (tap the floating status bar pill) and point this component at the same port.
LOTASpeechTCP.tox — Speech TCP Receiver
TCP/IP DAT with a callback script that parses LOTA's binary speech frame format and writes recognized words, partials, and finals to a speech_log Table DAT. Handles stream buffering for frames split across TCP packets. Drop into your network and recognized speech starts populating rows — no manual parsing required.
LOTASpeechUDP.tox — Speech UDP Receiver
UDP In DAT configured for "One Per Message" mode with a callback script that parses the same binary speech frame format as the TCP receiver. Simpler than TCP — each datagram is a complete frame, so no buffering is needed. Use this for low-latency fire-and-forget speech delivery on a local network.
NDI input (easiest)
Drop an NDI In TOP into your project. LOTA appears in the source list automatically. Select it, and you're receiving live video.
Point cloud via LOTABinaryPLYRecieverV2.tox
- Drop LOTABinaryPLYRecieverV2.tox into your TouchDesigner network.
- In LOTA, enable PLY Streaming and Binary Format in Transmission Settings → Point Cloud Stream and note the port.
- On the component, set the receiver port to match and wire its output Script TOP into your render network — live 3D geometry is ready to instance.
Camera tracking via OSC
Add an OSC In CHOP. Set the port to match LOTA's OSC port. You'll receive camera position (x, y, z), rotation (quaternion), and euler angles at ~30 Hz — perfect for driving a virtual camera or triggering effects based on device movement.
3D Export
Export & 3D Pipelines
Swipe right to the Gaussian Capture page. Record datasets with ARKit intrinsics, extrinsics, and LiDAR point clouds — ready for training, viewing, or post-production.
COLMAP
Exports cameras.bin, images.bin, and points3D.bin in COLMAP binary format. Compatible with OpenSplat, gsplat, and Nerfstudio for training Gaussian Splats directly from your iPhone captures.
Nerfstudio
Exports transforms.json + images/ JPEGs + points3D.ply. Ready for Nerfstudio, splatfacto, and Instant-NGP training pipelines.
Nerfstudio + Depth
Adds a depth/ folder to the Nerfstudio export. Best for depth-supervised training — produces better geometry on flat surfaces and uniform areas.
Point Cloud (PLY)
Exports a points3D.ply file with accumulated points from your session. Open in Blender, CloudCompare, MeshLab, or any tool that reads PLY.
Material (PBR)
Exports basecolor.png (sRGB), normal.png, height.png (16-bit), ao.png, roughness.png, preview.png, and a material.json manifest with patch size and tiling hints. Drop into Substance Designer/Painter, Blender, Unreal, Unity, or TouchDesigner. Requires LiDAR.
What happens during recording
- Mesh overlay — A semi-transparent wireframe shows scanned surfaces building up in real time (cyan near, purple far)
- Keyframe selection — Only frames where the camera moved at least 5 cm or rotated ~5° are saved, producing a well-distributed set of training views
- Blur detection — Motion-blurred frames are automatically rejected via Laplacian variance analysis
- Focus lock — Autofocus is disabled during recording to keep camera intrinsics consistent across all frames
- Haptic feedback — A subtle tap each time a keyframe is captured
- Counters — Elapsed time, keyframe count, and total point count shown on screen
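The keyframe criterion above can be sketched as a simple predicate. The thresholds come from the text; the function shape is illustrative:

```python
import math

# Save a frame only when the camera moved >= 5 cm or rotated ~5 degrees
# since the last saved keyframe.
MIN_TRANSLATION_M = 0.05              # 5 cm
MIN_ROTATION_RAD = math.radians(5.0)  # ~5 degrees

def is_keyframe(moved_m, rotated_rad):
    """True when either movement threshold since the last keyframe is crossed."""
    return moved_m >= MIN_TRANSLATION_M or rotated_rad >= MIN_ROTATION_RAD
```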
When you stop recording
- 1. Dataset files are written (metadata, point cloud, images)
- 2. Everything is compressed into a single .zip file
- 3. A summary shows format, keyframes, points, file size, and filename
- 4. The zip syncs to iCloud automatically if you chose an iCloud folder
Tips for best results
Move slowly
Walk around the subject at a steady pace. A typical 30-second scan produces ~80 keyframes and a 10–20 MB zip.
Capture from multiple angles
Shoot from low, mid, and high heights. Ensure adjacent viewpoints have 70–80% overlap for best reconstruction.
Use good lighting
Consistent, well-lit environments produce better results. Avoid reflective and transparent surfaces — LiDAR struggles with glass and mirrors.
Material Capture
Select Material from the format picker to swap the page from a recording flow to a plane-lock + shutter flow. One tap, ~½-second bake on iPhone 15/16 Pro at 1024², ZIP saved to your iCloud folder. Requires LiDAR.
When to use Material vs the other formats
- 3D scan of an object or room you can move around — use COLMAP, Nerfstudio, or Point Cloud
- Tileable PBR texture of a flat surface (floor, wall, table, fabric, brick…) — use Material
Pick Material from the format picker
The page swaps to the material-capture UI: status icon, plane-status chip, Lock Plane action, and shutter button.
Pick an export folder
Tap the folder icon (bottom left) to choose an iCloud Files folder if you haven't already.
(Optional) Open Material Settings
Adjust output resolution (512 / 1024 / 2048), AO sample count (32 / 64 / 128), normal convention (OpenGL or DirectX), delight strength, and roughness scale. All persist via UserDefaults.
Point at a flat surface and tap Lock Plane
ARKit raycast detects horizontal and vertical planes. Lock Plane snaps a 20 cm screen-aligned square patch centered on what you're pointing at — output "top" = away from camera, "right" = camera's right.
Tap the shutter
The torch fires a brief flash-pair sequence (~200 ms), autofocus locks for both grabs so they share the same focal distance, then the bake runs (~220 ms at 1024² with 64 AO samples on iPhone 15/16 Pro).
Save to Files
The Material Save Summary sheet shows a live PBR sphere preview, file-size estimate, and a metallic toggle. Tap Save — ZIP is written to your iCloud folder, the sheet auto-dismisses, and the plane lock stays active for the next shot.
What's in the ZIP
- basecolor.png — sRGB 8-bit albedo, white-balanced and de-lit via flash-pair specular subtraction
- normal.png — linear 8-bit tangent-space normal from the LiDAR depth gradient (OpenGL +Y up by default; DirectX selectable in Material Settings)
- height.png — linear 16-bit, plane-relative ±25 mm range mapped to UInt16
- ao.png — linear 8-bit horizon-based ambient occlusion baked from the height map
- roughness.png — linear 8-bit per-texel estimate from flash-pair specular sharpness, multiplied by your roughness scale slider
- preview.png — sRGB 8-bit Cook-Torrance BRDF sphere render of the captured material
- material.json — manifest with planeMeters (physical patch size), tilingHintFor1mSquare, normal convention, roughness method, metallic value, device hardware identifier, LiDAR generation, capture date, and gyro drift between flash pair
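To read height.png back into physical units, invert the stated ±25 mm UInt16 mapping. This sketch assumes a linear mapping with the locked plane at mid-range, which is not spelled out above:

```python
# Convert height.png's 16-bit values to plane-relative millimetres,
# assuming 0 -> -25 mm, 65535 -> +25 mm, plane at mid-range.
RANGE_MM = 25.0  # +/- range encoded in the 16-bit channel

def uint16_to_mm(v):
    """Map a UInt16 height sample to millimetres relative to the locked plane."""
    return (v / 65535.0 * 2.0 - 1.0) * RANGE_MM

mid_mm = uint16_to_mm(32768)   # near-zero: the plane itself
```

In practice you would load the PNG with an image library that preserves 16-bit depth and apply this per pixel.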
Receiver workflows
- Substance Designer — drop the unzipped folder onto a new graph; Substance auto-creates a Material node with all six channels wired
- Substance Painter — drag the folder into Shelf → Materials, then apply to a mesh
- Blender — in the Shading editor, add an Image Texture per file and wire to a Principled BSDF; set non-color spaces on normal, roughness, ao, and height
- Unreal Engine — drop the folder into the Content Browser → Unreal auto-creates a Material Instance. Switch normal convention to DirectX in Material Settings before capture so the green channel reads correctly
- Unity — standard URP / HDRP Lit shader inputs map 1:1
- TouchDesigner — load each PNG into a Movie File In TOP, wire to a PBR MAT
Tips for best results
- Hold the phone still during the shutter tap — the flash-pair takes ~200 ms; the manifest's gyroDriftRadians field surfaces how much drift the bake saw (below 0.5° / ~0.009 rad is fine)
- Capture distance — 20–60 cm from the surface gives the best detail-to-coverage ratio at the default 20 cm patch size
- Lighting — ambient + the iPhone torch. Too-bright ambient washes out the controlled flash signal that drives the roughness estimate
- Best surfaces — wood, fabric, leather, brick, concrete, plaster, painted drywall, asphalt, ceramic tile (matte), unpolished plastic, paper, carpet, bark, stone — anything opaque, dielectric, and roughly diffuse
Limitations & v1.2 caveats
Surfaces where the pipeline degrades or fails: glass (LiDAR passes through), mirrors and chrome (captures the reflection, not the surface), polished metal (basecolor contaminated by reflections), skin / wax / translucent plastic (subsurface scattering not modeled), velvet and fur (anisotropic / sheen BRDFs not modeled), wet surfaces (captures the wet film).
v1.2 caveats (planned for v1.3+):
- Metallic is a uniform toggle (manifest only) — no per-texel metallic.png. v1.3 adds CoreML SVBRDF inference
- Roughness is a heuristic (flashPairSpecularSharpness) — adjust the roughness scale slider to bias the result
- Framing rect is auto-installed at 20 cm — drag handles to resize / reposition land in v1.3
- Single-shot only — multi-view capture for cleaner albedo planned for v1.3
Configuration
Settings Reference
Settings are split into focused sheets reachable from two places: Transmission Settings (tap the floating status bar pill at the top of the Camera / Streaming or ARKit Tracking page — Receiver, Transport, NDI, OSC, Point Cloud Stream, Protocol Info, and a Help section with Replay Tutorial) and mode-specific settings (tap the <Mode> Settings button just below the status bar — only the section relevant to the active mode). Color and Mono have no per-mode settings, so no button is shown. The receiver IP is shared across all transports. The reference below documents every setting; the headings indicate which sheet hosts each section.
Receiver — Transmission Settings
Receiver IP — 192.168.1.100 — IP address of the computer receiving streams. Shared across all transports (TCP, UDP, OSC, PLY).
Transport (TCP/UDP) — Transmission Settings
TCP/UDP Output — On — Enable or disable the video transport.
Protocol — TCP — TCP (reliable, auto-reconnects) or UDP (fire-and-forget). TCP sends H.264 video for Color/Mono, raw Float32 depth for Depth/Point Cloud modes.
Port — 9847 — Destination port for TCP/UDP streams.
NDI — Transmission Settings
NDI Video Output — Off — Enable or disable NDI streaming. Auto-broadcasts as "LOTA (iPhone)" on the local network.
Side-by-Side — Off — Sends a 2x-wide frame: left half is the camera view, right half is the depth colormap. Standard format for TouchDesigner and Notch workflows.
OSC — Transmission Settings
OSC Output — Off — Enable or disable OSC messages.
Port — 9000 — OSC destination port.
Depth — Depth Settings sheet
Color Map — Visible Spectrum — Colormap for depth visualization. Options: Black & White, Black Aqua White, Blue Red, Deep Sea, Color Spectrum, Incandescent, Heated Metal, Sunrise, Visible Spectrum.
Point Cloud — Point Cloud Settings sheet
Frame Window — 30 (range 5–60) — Number of accumulated LiDAR frames in the live point cloud sliding window. Higher values show more spatial coverage but use more GPU memory. Only affects the live view, not Gaussian Capture.
Max Depth — 5.0 m (range 1–10 m) — Maximum LiDAR range. Points beyond this are discarded. Gen 1 LiDAR (iPhone 12–14 Pro) is reliable to ~5 m. Gen 2 (iPhone 15–16 Pro) can reach ~10 m. Affects both live view and Gaussian Capture exports.
Compute Quality — Balanced — GPU compute frame skip for thermal management. Full = every frame, Balanced = every 2nd, Efficient = every 3rd. Only affects live Point Cloud mode.
Min Confidence — Medium+ — LiDAR depth confidence filter. All = no filtering, maximum density. Medium+ = removes low-confidence noisy edge pixels. High Only = fewest points, highest accuracy. Affects both live view and Gaussian Capture exports.
Blob Tracking — Detection (Blob Track Settings sheet)
Base Style — Color — What the underlying camera layer looks like behind the rectangle outlines. Color (live RGB), Mono (grayscale), Mask (grayscale subject silhouette on black), or Binary (white-on-black silhouette — authentic TouchDesigner Blob Track TOP look).
Draw Blob Bounds — On — Draw 1-pixel hairline rectangle outlines around detected blobs. Turn off to see only the base layer.
Show ID Labels — Off — Draw #id text labels just outside the top-right corner of each blob bounding box.
Blob Color — Pure green — RGB color used for both rectangle outlines and ID labels.
Blob Tracking — Depth (Blob Track Settings sheet)
Min Depth — 0.5 m (range 0.1–5.0 m) — Near edge of the depth slab. Pixels closer than this are excluded from blob detection.
Max Depth — 3.0 m (range 0.5–10.0 m) — Far edge of the depth slab. Pixels farther than this are excluded from blob detection.
Min Confidence — Medium+ — LiDAR depth confidence filter (same semantics as Point Cloud). All / Medium+ / High Only. Pixels below the threshold are excluded from detection.
Blob Tracking — Constraints (Blob Track Settings sheet)
Min Blob Size — 50 px (range 10–500) — Minimum pixel area to count as a blob. Filters out single noise pixels and tiny artifacts.
Max Blob Size — 30000 px (range 100–49152) — Maximum pixel area. Rejects the "everything is one blob" failure mode where the entire scene merges together.
Max Move Distance — 0.15 (range 0.01–0.5) — Maximum normalized distance a blob can travel between frames and keep its ID. Increase for fast-moving subjects or higher-FPS scenes.
Blob Tracking — Revival (Blob Track Settings sheet)
Revive Blobs — On — Re-identify lost blobs with the same ID if they reappear inside the revival window. Mirrors TouchDesigner's blob lifecycle exactly.
Revive Time — 0.5 s — Seconds a lost blob is kept alive for revival before it transitions to expired and is permanently dropped.
Revive Distance — 0.2 — Maximum normalized distance for the revival match. A lost blob must reappear within this radius to recover its ID.
Include Lost in OSC — Off — Emit lost blobs in the OSC bundle with state = 2.
Include Expired in OSC — Off — Emit expired blobs in the OSC bundle with state = 3.
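The Revive Time setting defines a simple lifecycle timeline, sketched below. The state codes 2 (lost) and 3 (expired) come from the OSC settings above; the code for an active blob (1 here) is an assumption for illustration:

```python
# Sketch of the revival timeline implied by the settings above.
# States 2 (lost) and 3 (expired) match the documented OSC fields;
# STATE_ACTIVE = 1 is assumed, not confirmed by the docs.
STATE_ACTIVE, STATE_LOST, STATE_EXPIRED = 1, 2, 3

def blob_state(seconds_since_seen, revive_time=0.5):
    """Classify a blob by how long ago it was last detected."""
    if seconds_since_seen == 0.0:
        return STATE_ACTIVE
    if seconds_since_seen <= revive_time:
        return STATE_LOST      # still eligible for revival with its old ID
    return STATE_EXPIRED       # permanently dropped; ID will not return
```

With the two Include toggles off (the defaults), only active blobs appear in the OSC bundle; turn them on when a receiving patch needs to animate fade-outs for disappearing subjects.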
Point Cloud Stream (PLY) — Transmission Settings
PLY Streaming — Off — Enable or disable the live point cloud TCP stream.
Port — 9848 — PLY stream destination port.
Binary Format — Off (recommended: On) — Packed binary (15 bytes/point) vs CSV text. Binary is ~40% smaller, parses ~3× faster on the receiver, and is the format the supported LOTABinaryPLYRecieverV2.tox drop-in component targets — turn it on for every new capture. CSV text mode remains for backwards compatibility with custom receivers but is deprecated and will be removed in a future update.
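For custom receivers, the 15 bytes/point figure is consistent with three float32 position components plus three uint8 color channels. The exact field order is not documented here (the supported path is the LOTABinaryPLYRecieverV2.tox component), so treat the layout below as an assumption to verify against real packets:

```python
import struct

# Assumed point layout: x, y, z as little-endian float32, then r, g, b as
# uint8. 3*4 + 3 = 15 bytes, matching the documented packed-binary size.
POINT = struct.Struct("<fffBBB")
assert POINT.size == 15

def unpack_points(payload: bytes):
    """Yield (x, y, z, r, g, b) tuples from a packed point buffer."""
    usable = len(payload) - len(payload) % POINT.size
    for off in range(0, usable, POINT.size):
        yield POINT.unpack_from(payload, off)

# Round-trip one point to sanity-check the assumed layout.
buf = POINT.pack(1.0, -2.0, 0.5, 255, 128, 0)
```

A real receiver would accumulate bytes from the TCP socket and carry any trailing partial point over to the next read, since frame payloads need not arrive on point boundaries.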
Tracking — Body / Face / Hands Settings sheets
Skeleton Overlay — On — Show or hide body skeleton visualization on the tracking page.
Face Overlay — On — Show or hide the face blend shape bar graph on the tracking page.
Hand Overlay — On — Show or hide hand bone and joint visualization on the tracking page.
3D Hand Coordinates — Off — Use LiDAR-projected world-space hand coordinates instead of normalized screen-space. Requires a LiDAR-equipped device.
Transcription — Transcription Settings sheet
Send Per-Word OSC — On — Send each recognized word as it arrives (/lota/speech/word + TCP/UDP word frames). The /lota/speech/word_count integer always fires alongside per-word messages as a numeric, CHOP-visible signal.
Send Partial Transcript — On — Send running partial transcript updates as words arrive (/lota/speech/partial + TCP/UDP partial frames).
Send Final Transcript — On — Send finalized sentences when the recognizer commits (/lota/speech/final + TCP/UDP final frames).
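To show what a /lota/speech/word packet looks like on the wire, here is a stdlib-only round-trip of an OSC message carrying a single string argument. In practice a library such as python-osc is the usual receiving choice; this sketch only illustrates the OSC 1.0 framing (4-byte-aligned, null-terminated strings):

```python
# Minimal OSC 1.0 string-message encoder/decoder, for illustration only.
def osc_pad(b: bytes) -> bytes:
    # Null-terminate and pad to the next 4-byte boundary, per OSC 1.0.
    return b + b"\x00" * (4 - len(b) % 4)

def osc_string_message(address: str, text: str) -> bytes:
    # address, then the type-tag string ",s", then the string argument.
    return osc_pad(address.encode()) + osc_pad(b",s") + osc_pad(text.encode())

def parse_osc_string_message(packet: bytes):
    def read_str(buf, off):
        end = buf.index(b"\x00", off)
        return buf[off:end].decode(), off + ((end - off) // 4 + 1) * 4
    address, off = read_str(packet, 0)
    tags, off = read_str(packet, off)
    assert tags == ",s", "this sketch only handles single-string messages"
    value, _ = read_str(packet, off)
    return address, value

pkt = osc_string_message("/lota/speech/word", "hello")
```

The partial and final transcript messages differ only in address and payload length, so the same framing applies.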
Device Motion — Motion Settings sheet
Acceleration (X, Y, Z) — On — Gravity-removed acceleration in G-force, sent as three OSC floats per update (/lota/motion/accel/x, /y, /z).
Gyroscope (X, Y, Z) — Off — Rotation rate in rad/s, three OSC floats per update (/lota/motion/gyro/x, /y, /z).
Compass Heading — Off — Magnetic heading in degrees 0–360 (requires Location permission). Address /lota/motion/heading.
Barometric Pressure — Off — Atmospheric pressure in kPa plus relative altitude in meters (/lota/motion/pressure, /lota/motion/altitude).
Update Rate — 30 Hz — 30 / 60 / 100 Hz picker. 30 Hz matches ARKit; 100 Hz is useful for latency-critical controllers.
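On the receiving side, the /lota/motion/heading value (degrees clockwise from magnetic north) often needs converting into a direction vector to drive visuals. A small illustrative helper, not part of LOTA itself:

```python
import math

def heading_to_vector(degrees: float):
    """Convert a compass heading (degrees clockwise from north) into a
    2D unit vector: x points east, y points north. Illustrative helper."""
    rad = math.radians(degrees)
    return (math.sin(rad), math.cos(rad))
```

A heading of 0° maps to (0, 1) (due north) and 90° to roughly (1, 0) (due east), which avoids the discontinuity at the 360°→0° wraparound when the raw angle is animated directly.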
Audio Analysis — Audio Settings sheet
Levels (Bass / Mid / High) — On — Continuous 0–1 energy per frequency band with rolling auto-gain. /lota/audio/bass, /lota/audio/mid, /lota/audio/high.
Beat Detection — Off — Binary 0/1 switch per band on detected onset (50 ms gate window). /lota/audio/drums/low, /lota/audio/drums/mid, /lota/audio/drums/high.
Dynamics — Off — Fast transient detector, pulse-shaped 0–1 (rises instantly, decays over ~200 ms). Address /lota/audio/burst.
FFT Spectrum (20 bands) — Off — 20 log-spaced FFT bands across 20–20,000 Hz, each normalized 0–1 (/lota/audio/fft/0 through /lota/audio/fft/19), rendered with a red → blue color gradient.
Update Rate — 30 Hz — 30 / 60 Hz picker. Analysis runs internally at ~86 Hz and decimates to the chosen output rate.
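Since the 20 FFT bands are described as log-spaced across 20 Hz–20 kHz, a receiver can reconstruct the approximate band edges to label or remap them. This sketch assumes geometrically spaced edges over exactly that range (LOTA's internal binning may differ slightly):

```python
def log_band_edges(n_bands=20, lo=20.0, hi=20000.0):
    """Geometrically spaced band edges: n_bands bands need n_bands + 1 edges.
    Each band spans the same frequency ratio (~1.41x for these defaults)."""
    ratio = (hi / lo) ** (1.0 / n_bands)
    return [lo * ratio**i for i in range(n_bands + 1)]

edges = log_band_edges()
```

Under this assumption, /lota/audio/fft/0 covers roughly 20–28 Hz and /lota/audio/fft/19 roughly 14.1–20 kHz, so the low bands resolve bass content much more finely than a linear spacing would.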
Audio mode uses the same NSMicrophoneUsageDescription permission as Transcription mode. LOTA can only analyze the microphone input — iOS sandboxing prevents capturing system audio from other apps.
Inclusive Design
Accessibility
LOTA is designed to meet Apple's App Store accessibility standards. Every feature works for every user.
VoiceOver
Voice Control
Say "start stream", "record", or "switch to Depth" and LOTA responds. Every button is discoverable by voice.
Dynamic Type
Dark Interface
Differentiate Without Color
Increase Contrast
Reduce Motion
Support
Frequently Asked Questions
Which iPhones support LOTA?
Any iPhone running iOS 26.2 or later. LiDAR features (Depth, Point Cloud, Blob Track, Gaussian Capture, Material Capture, 3D hand coordinates) require iPhone 12 Pro or later. Color, Mono, Transcription, Motion, and Audio modes, NDI streaming, and TCP/UDP streaming all work on iPhones without LiDAR. Face tracking requires TrueDepth camera (iPhone X or later). Body tracking requires A12 chip or later. iPad Pro models with LiDAR are also supported.
Does Transcription mode need internet access?
No. Transcription uses iOS 26's on-device SpeechAnalyzer framework and runs entirely offline. After you grant Speech Recognition permission on first launch, LOTA prefetches the on-device speech model for your locale in the background — by the time you open Transcription mode the model is on disk, so first captions arrive without a download wait. If the prefetch didn't finish, the model installs lazily on first use as a fallback. Audio never leaves your device.
Do the receiving machine and iPhone need to be on the same network?
Yes. For TCP, UDP, OSC, and PLY streaming, both devices must be on the same local network. NDI also uses the local network but handles discovery automatically. A 5 GHz Wi-Fi connection is recommended.
What software can receive LOTA streams?
Any NDI-compatible software (TouchDesigner, OBS, vMix, Resolume), any tool that reads TCP/UDP sockets, and any OSC-capable application (Max/MSP, Ableton, Unreal Engine).
How do I use the export data for Gaussian Splat training?
Export a COLMAP or Nerfstudio dataset from the Gaussian Capture page, transfer the zip to your training machine, and point OpenSplat, Nerfstudio, or gsplat at the extracted folder. The files are in the exact format these tools expect.
Is there a latency cost to streaming?
NDI and TCP/UDP streams typically add 1–3 frames of latency depending on your network. OSC tracking data arrives at 30 Hz with sub-frame latency. A wired connection or 5 GHz Wi-Fi keeps things as fast as possible.
What export format should I use?
COLMAP for OpenSplat and gsplat. Nerfstudio for splatfacto and Instant-NGP. Nerfstudio + Depth for best geometry on flat or featureless surfaces. Point Cloud (PLY) for quick visualization in Blender or CloudCompare.