Introduction

Getting Started

LOTA turns your iPhone's LiDAR sensor into a professional spatial capture and streaming tool. Stream depth, color, and point cloud over the network in real time, capture datasets for 3D reconstruction, and stream motion capture, all with no extra hardware.

Requirements

• iPhone 12 Pro or later (LiDAR required for Depth, Point Cloud, Blob Track, Gaussian Capture, and Material Capture)
• Color, Mono, Neural Depth, Transcription, Motion, and Audio modes work on every iPhone, no LiDAR needed
• iOS 26.2 or later
• Wi-Fi network (for streaming features)

Reading this page: throughout these docs, the default value of every mode and setting is shown in green, so you can tell at a glance how LOTA behaves before you change anything.

Older devices: settings that can affect frame rate, GPU memory, or thermals on older iPhones carry a yellow Older devices tag. Lower these first if a capture stutters or the phone runs hot.

Install LOTA and complete first launch

Download LOTA from the App Store. On first launch, a welcome screen lists the seven iOS permissions LOTA needs (Camera, Microphone, Speech Recognition, Local Network, Location for compass heading, Motion & Fitness, and Photos add-only for saving local recordings). It also states plainly that LOTA never tracks you, sells your data, or sends anything off your device unless you point it at a receiver. Tap Continue and the seven OS permission dialogs appear in sequence. The Photos prompt uses the narrower add-only variant, so LOTA can write videos to your camera roll but never read existing photos.

Take the 10-step guided tour

After permissions, a welcome card and spotlight tour walk you through the three pages, the mode picker, the status bar, the bottom endpoint summary, mode-specific settings, and the transmit button. Tap anywhere to advance, or hit Skip at the bottom-left. Replay it anytime from Transmission Settings → Help → Replay Tutorial.

Choose a capture mode

Tap the mode picker at the top, a glass capsule showing the active mode's icon next to a chevron that flips as the panel opens and closes. The panel drops down with all 9 modes in a circular-icon grid (Color, Mono, Depth, Neural Depth, Point Cloud, Blob Track, Transcription, Motion, Audio). The active mode is highlighted yellow; LiDAR-required modes are dimmed and disabled on non-LiDAR devices. Each mode activates instantly, with no restart required.

Stream or record locally

The middle page has an iOS Camera-style shutter button at bottom-center with a STREAM | RECORD segmented control below it. The two are mutually exclusive: while one is active the other is locked, so a recording can't start mid-stream or vice versa. STREAM broadcasts over the network (all enabled transports send at once); RECORD saves an H.264 .mov of the live composited Metal output to your Photos library on stop (4-hour cap). Configure transports from the floating status bar pill (Transmission Settings), and set per-mode options from the <Mode> Settings button. Swipe right for Gaussian Capture (3D datasets, PBR materials, IMU and Audio Trace exports).

App Layout

Navigation

LOTA uses a three-page swipeable layout. Swipe left and right to move between pages.

ARKit Tracking (Swipe Left)

Body, face, and hand motion capture via ARKit. Streams skeleton, blend-shape, and hand-landmark data over OSC to TouchDesigner, Max/MSP, Ableton, and more.

Camera / Streaming (Center, default)

The main page. Live camera feed with nine capture modes (Color, Mono, Depth, Neural Depth, Point Cloud, Blob Track, Transcription, Motion, Audio) and streaming controls. Tap the mode picker at the top (glass capsule and chevron) to open the panel and switch modes. Tap the floating status bar pill for Transmission Settings, or the <Mode> Settingsbutton below it for the active mode's options. The streaming toggle is at bottom right.

Gaussian Capture (Swipe Right)

Record datasets for Gaussian Splatting and 3D reconstruction, capture a flat surface as a PBR material set with the Material format, or export IMU and Audio Trace files (sensor or audio analysis over time as CSV/TSV plus a manifest and optional PDF report). Choose a format, pick an iCloud folder, then tap record (3D-scene and trace formats) or lock a plane and tap the shutter (Material).

Capture

Capture Modes

LOTA has nine capture modes on the Camera / Streaming page. Tap the mode picker at the top, a glass capsule with the active mode's icon and a chevron, to open a panel of circular icon buttons. The active mode is highlighted yellow; LiDAR-required modes are dimmed on non-LiDAR devices. Tap outside the panel to dismiss it.

Color

No LiDAR required

Live RGB camera feed at 60 FPS. What you see on screen is exactly what gets streamed.

Mono

No LiDAR required

High-contrast grayscale feed optimized for low-light environments and precision spatial scanning.

Depth

LiDAR required

LiDAR depth visualization with 9 selectable colormaps including thermal, incandescent, deep sea, and visible spectrum.

Neural Depth

No LiDAR required

AI-estimated depth from the regular camera feed via Depth Anything V2 Small (ByteDance Research, packaged for Core ML by Apple), running on the Apple Neural Engine at ~30 ms / frame. Same nine colormaps as LiDAR Depth. Doubles as a fallback for NDI side-by-side on non-LiDAR phones. Fully on-device; nothing leaves the phone.

Point Cloud

LiDAR required

Streams the LiDAR point cloud over PLY. Every pixel of the 256×192 depth map is unprojected into 3D and sent over the network. The on-device view shows the live RGB camera, so geometry is visualized receiver-side. Configure max depth and compute quality in Settings.

Blob Track

LiDAR required

TouchDesigner-compatible blob tracker. Carves a configurable depth slab out of the LiDAR scan, finds connected regions, assigns each one a stable ID across frames, and streams per-blob metadata over OSC at TD-matching addresses. The full visualization (camera + outlines + ID labels) is captured by NDI.

Transcription

No LiDAR required

Live on-device speech-to-text with a mirrored bar waveform visualization. Recognized words stream out over OSC, TCP, UDP, and appear on screen as captions.

Motion

No LiDAR required

Turns the iPhone into a wireless motion-sensor OSC source. Streams accelerometer, gyroscope, compass heading, and barometric pressure as per-axis OSC channels. Each active value draws its own scrolling line graph on screen (TouchDesigner CHOP viewer aesthetic), captured by NDI.

Audio

No LiDAR required

Real-time microphone audio analysis using Apple's Accelerate/vDSP framework. Streams frequency-band Levels, per-band Beat Detection triggers, Dynamics bursts, and a 20-band FFT spectrum as OSC channels. Each active channel renders as a scrolling graph lane, captured by NDI.

Switching modes

Tap the mode picker at the top of the screen and select a cell from the panel, or say switch to Depth with Voice Control enabled. Switching is instant and does not interrupt an active stream.

Neural Depth mode

Select Neural Depth from the mode picker to estimate depth from the regular camera feed using Depth Anything V2 Small (ByteDance Research, packaged for Core ML by Apple, Apache 2.0, 24.8 M params). Inference runs entirely on the Apple Neural Engine at roughly 30 ms per frame, so nothing leaves the device.

Use Neural Depth when

You're on a non-LiDAR iPhone and want a depth visualization or NDI side-by-side feed
You want a comparison source alongside LiDAR for testing or stylistic reasons
You need depth in conditions where LiDAR struggles: very bright sunlight, very long range, glass or reflective surfaces

First selection

A centered glass card appears explaining the model is initializing. The first frame can take a moment as the Neural Engine warms up; subsequent frames are real-time. The card auto-fades when inference begins.

Color map and About the Model

The same nine colormaps as LiDAR Depth are available under Neural Depth Settings → Visualization. The About the Model section credits ByteDance Research and Apple, confirms inference runs fully on-device on the Neural Engine, and links to the Hugging Face model page.

NDI side-by-side fallback

Enabling NDI side-by-side on a phone without LiDAR now composites the regular camera on the left and Depth Anything V2 estimated depth on the right, so the side-by-side workflow is no longer LiDAR-only.

Blob Track mode

Select Blob Track from the mode picker to turn the iPhone into a wireless TouchDesigner-compatible blob tracker. LOTA carves a configurable depth slab out of the LiDAR scan, finds connected regions of in-range pixels, assigns each one a stable ID across frames, and streams the result as both a video (NDI) and per-blob metadata (OSC). It replaces a Kinect plus Blob Track TOP chain with a single iPhone: no background plate, no lighting calibration, and it works on a moving camera.

While active

Blob count HUD. A N blob(s) count appears in the status bar with a hex-grid icon
Live depth range. The current slab (e.g. 0.5m – 3.0m) is shown just below the status pill so operators can verify the active range without opening Settings
Hairline outlines. Each detected blob gets a 1-pixel rectangle drawn over the camera feed in your chosen color
NDI captures everything. The full composition (base layer, rectangles, and optional ID labels) is captured by NDI so receivers see exactly what the phone shows
Background detection. Connected-components labeling runs off the ARKit delegate thread so the main thread stays responsive for touch and UI

Base styles

Set in Blob Track Settings → Detection (tap theBlob Track Settings button below the status bar). Controls what the underlying camera layer looks like behind the rectangles.

Color (default). Live color camera feed. Operator-friendly view that shows what you're filming
Mono. Grayscale camera feed. Cleaner look, no color distractions
Mask.Grayscale subject silhouette on black. Verifies what's being detected and hides the background
Binary. Pure white silhouette on black. The authentic TouchDesigner Blob Track TOP look, maximum contrast

Toggle Draw Blob Bounds off to see only the base layer. Toggle Show ID Labels on to draw#1, #7, etc. labels just outside the top-right corner of each bbox. Both rectangles and labels use the Blob Color picker.

OSC addresses (TD-compatible)

Field names match TouchDesigner'sblobtrackTOP_Class verbatim. Every message is padded to 10 slots so CHOP channel counts stay stable as blobs come and go, and empty slots have id == 0 for filtering via a Select CHOP.

/lota/blob/count: active blob count (int)
/lota/blob/ids: 10 stable tracker IDs
/lota/blob/u, /lota/blob/v: normalized centroid (10 floats each)
/lota/blob/width, /lota/blob/height: normalized bbox size (10 floats each)
/lota/blob/tx, /lota/blob/ty: pixel centroid in 256×192 depth space (10 ints each)
/lota/blob/age: seconds tracked (10 floats)
/lota/blob/state: 0=new, 1=revived, 2=lost, 3=expired (10 ints)

Lifecycle states

A blob fires state = 0 (new)on first detection, then is silent while it's actively tracked. If it disappears, it transitions to lost and is held in the revival window for Revive Time seconds. If it reappears within Revive Distance of where it was last seen, it gets the same ID back and fires revived. Otherwise it transitions to expiredand is permanently dropped. This matches TouchDesigner's blob lifecycle exactly.

TCP / UDP fallback

In blob mode, TCP/UDP carries raw Float32 depth maps (the same format as the Depth capture mode), so receivers that want to do their own client-side analysis get the raw data. The structured per-blob metadata is OSC-only.

Transcription mode

Select Transcriptionfrom the mode picker to turn your iPhone into a wireless live speech-to-text source for creative tools. The camera view is replaced with a black canvas and a 200-bar mirrored waveform driven by the microphone. Recognized words appear on screen as live captions. It uses iOS 26's on-deviceSpeechAnalyzer framework: fast, private, and offline.

First-time setup

Microphone and Speech Recognition permissions are requested upfront on first launch (along with Camera and Location), and the on-device speech model for your language is prefetched in the background. By the time you open this mode for the first time, transcription is usually ready to go, with no model-download wait. If you denied either permission, the mode displays a message with a shortcut to iOS Settings to grant access. If the prefetch didn't finish or failed, the mode falls back to downloading on first use.

While active

Listening indicator. A pulsing microphone icon and "LISTENING" label appear in the status bar
Waveform. 200 mirrored bars pulse with your voice in real time and stream through NDI as a black-and-white video source
Live captions. Recognized words appear on screen as large centered text as they arrive
Camera tracking OSC suppressed. Camera position messages are automatically paused so speech data stands out in your OSC receiver
Clean exit. Leaving transcription mode stops the audio engine completely. No background microphone access.

OSC addresses

When OSC streaming is enabled, transcription sends these addresses:

/lota/speech/word: each recognized word (string)
/lota/speech/word_count: incrementing counter (int, visible in OSC In CHOP)
/lota/speech/partial: running partial transcript (string)
/lota/speech/final: finalized sentences (string)

String messages arrive in TouchDesigner's OSC In DAT (not OSC In CHOP, which only handles numeric channels). The word_count integer is the only speech message visible in OSC In CHOP, useful for signal-flow monitoring.

TCP / UDP binary wire format

For custom integrations, LOTA also sends recognized speech as binary frames over TCP and UDP:FrameType.speechText = 4, a 24-byte FrameHeader+ UTF-8 payload. The header's width field is repurposed for speech kind (0 = word, 1 = partial, 2 = final), height for word index. Use the drag-and-drop TD components below to parse this format automatically.

Which transport should you use?

OSC. Easiest for TouchDesigner, Max/MSP, Resolume. Drop in an OSC In DAT, set the port, done.
TCP / UDP. Use the LOTASpeechTCP / LOTASpeechUDP components (downloads in the TouchDesigner section below) for plug-and-play binary parsing, or integrate into custom apps, game engines, and scripts that don't have OSC parsers.
NDI. When you want the waveform visual as a live video source for reactive effects or broadcast overlays.

Motion mode

Select Motion from the mode picker to turn the iPhone into a wireless motion-sensor OSC source. LOTA reads the device motion sensors and streams them as OSC. Each active sensor value also renders as its own scrolling line graph lane on screen so operators can see the data in real time, and the full graph composition is captured by NDI. Works on every iPhone, no LiDAR required.

Sensors (individually toggleable in Settings)

Acceleration (ON by default). Gravity-removed, G-force units. Addresses /lota/motion/accel/x, /y, /z
Gyroscope. Rotation rate in rad/s. Addresses /lota/motion/gyro/x, /y, /z
Compass Heading. Degrees 0–360 (requires Location permission). Address /lota/motion/heading
Barometric Pressure. kPa plus relative altitude in meters from session start. Addresses /lota/motion/pressure and /lota/motion/altitude

Update rate picker: 30 / 60 / 100 Hz. 30 Hz matches ARKit. 100 Hz is useful for latency-critical controllers.

Permissions

On first entry to Motion mode, iOS prompts for Motion & Fitness access. If Compass is toggled on, an additional Location-When-In-Use prompt appears. CoreMotion data usesNSMotionUsageDescription; heading uses NSLocationWhenInUseUsageDescription. If denied, the relevant sensors silently skip without crashing (compass stays at −1, for example). Permissions are requested upfront when entering the mode, not mid-session.

Scrolling graph aesthetic

Each active sensor value gets its own horizontally stacked lane with a dim center line, a separator line, and a colored 2-character glyph label (AX,AY, AZ, GX, GY,GZ, HD, PR, AL) that follows the line's current Y position. Between sample arrivals, the graph slides left by a fractional sample-width (sub-sample scroll) so the visual feels continuous even at 30 Hz data rates. Rendered in Metal and captured by NDI.

Use cases

Phone as a wireless tilt / shake / toss controller for reactive visuals in TouchDesigner, Resolume, or Max/MSP
Environmental sensing: barometric pressure drift and altitude tracking for installations
Wearable motion source: a phone clipped or strapped to a performer streams tilt, rotation rate, and compass heading as OSC
Camera pose OSC is automatically suppressed in Motion mode so sensor channels stand out cleanly in an OSC In CHOP

Audio mode

Select Audio from the mode picker to turn the iPhone into a wireless real-time audio analysis source. LOTA taps the microphone via AVAudioEngineand extracts musically relevant features using Apple's Accelerate/vDSP framework, with no third-party dependencies and all processing on-device. Each active channel renders as a scrolling graph lane and is captured by NDI. Works on every iPhone, no LiDAR required.

Four channel groups (individually toggleable)

Levels (ON by default). Continuous 0–1 energy per frequency band with rolling auto-gain. /lota/audio/bass, /lota/audio/mid, /lota/audio/high
Beat Detection. Binary 0/1 switch per band on detected onset (holds at 1.0 for 50 ms after each hit). /lota/audio/drums/low, /lota/audio/drums/mid, /lota/audio/drums/high
Dynamics. Fast-vs-slow envelope difference, pulse-shaped 0–1 (rises instantly on transient, decays over ~200 ms). /lota/audio/burst
FFT Spectrum. 20 log-spaced frequency bands across 20–20,000 Hz, each normalized 0–1 via per-bin rolling max. /lota/audio/fft/0 through /lota/audio/fft/19

Update rate: 30 / 60 Hz. FFT and analysis run internally at ~86 Hz and decimate to the user's output rate.

Frequency bands

BASS (20–200 Hz): kick, sub-bass, bass guitar fundamentals
MID (200–2000 Hz): vocals, snare body, guitars, most instruments
HIGH (2000–8000 Hz): snare attack, hi-hats, vocal consonants, cymbals

Beat detection algorithm

Multi-stage onset detection designed to be accurate on dense, dynamic music without false-triggering on silence:

1. Log-magnitude spectral flux per band (more perceptually aligned than raw flux)
2. One-pole lowpass smoothing (~15 Hz cutoff)
3. Peak-picking on local maxima, which catches rapid drum fills that threshold-only methods miss
4. Adaptive threshold noise_floor + k·stddev where k is inversely modulated by coefficient of variation (dense music gets lower k, steady signals get higher k)
5. 20th-percentile noise-floor tracking over ~3 s (robust to sustained beats)
6. Silence energy gate per band, so no beats fire when the band itself is quiet
7. 50 ms minimum inter-onset time, which supports up to ~1200 BPM / 32nd notes

Drum channels are binary (0 or 1), trivial to route into a TouchDesigner Trigger CHOP or Logic CHOP without thresholding.

Permissions & limitations

On first entry to Audio mode, iOS prompts for Microphone access. If denied, a clear overlay with an "Open Settings" deep-link appears and no audio engine starts.

System audio limitation:LOTA analyzes the microphone input only. It cannot capture system audio (Spotify, Apple Music, YouTube) because iOS sandboxing doesn't expose system audio to third-party apps. To analyze music playing on the iPhone, play it through the speakers and LOTA's mic picks it up naturally. For high-fidelity analysis, play audio from a separate source into the iPhone's mic.

TouchDesigner pattern

Drop an OSC In CHOPand point it at the phone's IP on port 9000. Enable Levels (default) and channels bass, mid, high appear and start pulsing. Toggle Beat Detection on and drums_low, drums_mid, drums_high appear as clean 0/1 signals you can route into a Trigger CHOP for beat-synced effects. FFT bands render with a red-to-blue color gradient (F0 = lowest, F19 = highest) so you can see where energy is on the spectrum at a glance.

Network

Streaming and local recording

The middle page has an iOS Camera-style shutter button at bottom-center with a STREAM | RECORD segmented control below it that picks which paradigm the shutter triggers. The two are mutually exclusive: while one is running the other is locked, so you can't accidentally start a recording mid-stream or vice versa. Tap the floating status bar pill at the top to open Transmission Settings, a focused sheet with This Device (your iPhone's active IPs), Receiver, TCP/UDP, NDI, OSC, Point Cloud Stream, and Protocol Info sections. While streaming, each per-protocol chip in the status bar (OSC, NDI, TCP/UDP, PLY) goes from dim white to red, and a centered table at the bottom of the screen lists every active transmission with its <host>:<port> (plus the NDI source name LOTA - <deviceLabel>) and a footer line with the iPhone's own active IPs (e.g. This iPhone: 192.168.1.42 (Wi-Fi)), so you can verify the whole wiring at a glance without opening the sheet. In landscape, the shutter and segmented control stay pinned to the device's physical bottom edge, where they sit in portrait, matching iOS Camera. Rotate the phone and the island slides to the screen edge matching that physical position, while all other UI rotates naturally.

STREAM vs. RECORD

STREAM. Tap the shutter to start or stop network streaming. All enabled transports (TCP/UDP, NDI, OSC, PLY) send at once.
RECORD. Tap the shutter to start or stop a local recording of the live composited Metal output, saved as an H.264 .movto your Photos library on stop. Available in every camera-using mode (Color, Mono, Depth, Neural Depth, Point Cloud, Blob Track) and disabled in Transcription, Motion, and Audio, where there's no camera content to capture. 4-hour cap per recording. It uses PHPhotoLibrary add-only access, so LOTA can write videos but never read your existing photos.

NDI

Industry-standard video-over-IP. LOTA broadcasts as LOTA - <Device Label>, where the label is a random 4-char ID by default (e.g. LOTA - 7F3A) and can be customized in the Receiver section to anything like LOTA - Stage Left. The stream is auto-discovered by TouchDesigner, OBS, vMix, Resolume, and any other NDI-compatible receiver, and per-device labels mean multiple LOTA phones in the same venue show up as distinct sources. Optional side-by-side mode sends a 2x-wide frame with the camera view on the left and depth colormap on the right. It uses LiDAR depth on Pro phones and Depth Anything V2 estimated depth on non-LiDAR phones, so the side-by-side workflow is no longer LiDAR-only.

TCP / UDP

Streams H.264-encoded video for Color and Mono modes, and raw Float32 depth maps for Depth and Point Cloud modes. TCP reconnects automatically; UDP is fire-and-forget. Set the destination IP and port in Settings.

OSC

Streams real-time camera tracking data at ~30 Hzover UDP. Point any OSC-capable tool at LOTA's IP and port to receive:

/lota/camera/position: x, y, z (3 floats)
/lota/camera/rotation: quaternion x, y, z, w (4 floats)
/lota/camera/euler: pitch, yaw, roll (3 floats)
/lota/mode: current capture mode (1 Hz)
/lota/fps: current frame rate (1 Hz)

PLY (live point cloud)

Sends live point cloud frames over TCP in a fixed packed binary format: UInt32 LE point count followed by N × (3 Float32 XYZ + 3 UInt8 RGB) = 15 bytes/point. Default port 9848. Pair with the LOTABinaryPLYRecieverV2.tox drop-in TouchDesigner component (section below) for plug-and-play setup. The legacy CSV-text variant and its toggle were removed in v1.2.3; binary is now the only supported wire format.

How to start streaming

Open Transmission Settings (tap the status bar pill)

Tap the floating status bar pill at the top of the screen. Enable the protocols you need and set the destination IP (shared across all transports) and port for each. NDI requires no configuration.

Connect to the same Wi-Fi network

LOTA and your receiving machine must be on the same local network. A 5 GHz network is recommended for lowest latency.

Set STREAM (bottom-center) and tap the shutter

Make sure the segmented control under the shutter is on STREAM (not RECORD), then tap the shutter. All enabled protocols start simultaneously. The status bar pill shows a red chip for each active protocol; the bottom endpoint summary lists their host:port destinations.

Motion Capture

ARKit Tracking

Swipe left from the main camera page to access ARKit motion capture. Three tracking modes stream data over OSC in real time.

Body Tracking

No LiDAR required

3D skeleton detection via the rear camera. Tracks 91 ARKit joints, streams 18 key joints over OSC. Visual overlay shows white bones with green joint dots. Requires A12 chip or later.

/lota/body/skeleton: 18 joint positions
/lota/body/root: root transform
/lota/body/detected: detection state

Face Tracking

No LiDAR required

52 facial blend shapes captured via the front-facing TrueDepth camera. Overlay shows a bar graph of the top 8 most active blend shapes. Requires TrueDepth camera (iPhone X or later).

Each blend shape is sent as its own named OSC address (e.g. /lota/face/browDown_L, /lota/face/eyeSquint_R) bundled in a single UDP datagram.

Hand Tracking

No LiDAR required

Detects up to 2 hands simultaneously via the rear camera using the Vision framework. 21 landmarks per hand, streamed over OSC organized by finger. The overlay shows bone chains with joint dots, teal for the left hand and orange for the right.

/lota/hand/{left|right}/wrist: 3 floats
/lota/hand/{left|right}/thumb: 12 floats
/lota/hand/{left|right}/index: 12 floats
/lota/hand/{left|right}/middle: 12 floats
/lota/hand/{left|right}/ring: 12 floats
/lota/hand/{left|right}/pinky: 12 floats

Hand coordinate modes

2D (default). Normalized screen-space coordinates, works on all iPhones. 3D (opt-in). World-space coordinates projected via LiDAR depth, so it requires a LiDAR device. Toggle it in Hands Settings → 3D Hand Coordinates (tap the Hands Settings button below the status bar on the ARKit Tracking page).

Integration

TouchDesigner

LOTA is built with TouchDesigner workflows in mind. Stream live depth, color, and point cloud data directly into your TD project with no capture cards or expensive rigs.

LOTABinaryPLYRecieverV2.tox: Binary Point Cloud Receiver

Drop-in TouchDesigner receiver for LOTA's binary PLY stream. It auto-detects the binary header and only asks for the receiver port and a Script TOP target. Uses numpy bulk parsing and GPU instancing to handle 49K+ points at 60 fps. Enable Binary Format in Transmission Settings → Point Cloud Stream (tap the floating status bar pill) and point this component at the same port.

Download .tox

LOTASpeechTCP.tox: Speech TCP Receiver

TCP/IP DAT with a callback script that parses LOTA's binary speech frame format and writes recognized words, partials, and finals to a speech_log Table DAT. Handles stream buffering for frames split across TCP packets. Drop it into your network and recognized speech starts populating rows, with no manual parsing required.

Download .tox

LOTASpeechUDP.tox: Speech UDP Receiver

UDP In DAT configured for "One Per Message" mode with a callback script that parses the same binary speech frame format as the TCP receiver. Simpler than TCP: each datagram is a complete frame, so no buffering is needed. Use this for low-latency fire-and-forget speech delivery on a local network.

Download .tox

NDI input (easiest)

Drop an NDI InTOP into your project. LOTA appears in the source list automatically. Select it, and you're receiving live video.

Point cloud via LOTABinaryPLYRecieverV2.tox

Drop LOTABinaryPLYRecieverV2.tox into your TouchDesigner network.
In LOTA, enable PLY Streaming and Binary Formatin Transmission Settings → Point Cloud Stream and note the port.
On the component, set the receiver port to match and wire its output Script TOP into your render network, and live 3D geometry is ready to instance.

Camera tracking via OSC

Add an OSC InCHOP and set the port to match LOTA's OSC port. You'll receive camera position (x, y, z), rotation (quaternion), and euler angles at 30 Hz, perfect for driving a virtual camera or triggering effects based on device movement.

3D Export

Export & 3D Pipelines

Swipe right to the Gaussian Capture page. Record datasets with ARKit intrinsics, extrinsics, and LiDAR point clouds, ready for training, viewing, or post-production.

COLMAP

LiDAR required

Exports cameras.bin, images.bin, and points3D.bin in COLMAP binary format. Compatible with OpenSplat, gsplat, and Nerfstudio for training Gaussian Splats directly from your iPhone captures.

Nerfstudio

LiDAR required

Exports transforms.json, images/ JPEGs, and points3D.ply. Ready for Nerfstudio, splatfacto, and Instant-NGP training pipelines.

Nerfstudio + Depth

LiDAR required

Same as Nerfstudio, plus 16-bit PNG depth maps in a depth/ folder. Best for depth-supervised training, since it produces better geometry on flat surfaces and uniform areas.

Point Cloud (PLY)

LiDAR required

Standalone points3D.ply file with accumulated points from your session. Open in Blender, CloudCompare, MeshLab, or any tool that reads PLY. The PLY Format toggle in Capture Settings writes either a compact binary_little_endian file (default) or a human-readable ASCII text PLY.

Material (PBR)

LiDAR required

Single-shot capture of a flat surface as a complete PBR material set. ZIP contains basecolor.png (sRGB), normal.png, height.png (16-bit), ao.png, roughness.png, preview.png, and a material.json manifest with patch size and tiling hints. Drop into Substance Designer/Painter, Blender, Unreal, Unity, or TouchDesigner.

IMU Trace

No LiDAR required

Records device motion sensors (acceleration, gyroscope, compass heading, barometric pressure / relative altitude) over time. ZIP contains recording.csv or .tsv (one row per sample, first column t_seconds), manifest.json describing every column, and README.txt. Optional multi-page dark-theme PDF report with per-channel charts (X/Y/Z lines, |A| magnitude, compass polar rose, pressure & altitude) and a statistics table.

Audio Trace

No LiDAR required

Records audio analysis frames (bass/mid/high levels, drum onsets, dynamics, 20-band FFT spectrum) over time using the same mic engine as Audio mode. Same ZIP shape as IMU Trace; optional PDF report with FFT spectrogram heatmap, levels traces, beat-events timeline, dynamics burst, and statistics.

PLY Viewer

No LiDAR requiredNew in v1.2.6

Read-only. Inspect any saved .ply point cloud on the device, whether a LOTA capture or a file dragged in via the Files app, in an interactive 3D viewer. Selecting PLY Viewer pauses the ARSession (saves battery and thermals), swaps the page to a dark canvas with orbit controls and a neon axis gizmo, and morphs the shutter into a folder glyph that opens the file picker. Loads are streamed off the main thread; clouds are capped at 5 M points with uniform sub-sampling. 3D Gaussian Splat .ply files are detected automatically and rendered as splats.

ZIP naming: every export ZIP is named LOTA_<timestamp>_<MODE>.zip (e.g. LOTA_2026-04-30_14-22-08_IMU.zip) so the format is identifiable from the filename alone.

What happens during recording

Mesh overlay. A semi-transparent wireframe shows scanned surfaces building up in real time (cyan near, purple far)
Keyframe selection. Only frames where the camera moved at least 5cm or rotated ~5° are saved, producing a well-distributed set of training views
Blur detection. Motion-blurred frames are automatically rejected via Laplacian variance analysis
Focus lock. Autofocus is disabled during recording to keep camera intrinsics consistent across all frames
Haptic feedback. A subtle tap each time a keyframe is captured
Counters. Elapsed time, keyframe count, and total point count shown on screen

When you stop recording

1. Dataset files are written (metadata, point cloud, images)
2. Everything is compressed into a single .zip file
3. A summary shows format, keyframes, points, file size, and filename
4. The zip syncs to iCloud automatically if you chose an iCloud folder

Tips for best results

Move slowly

Walk around the subject at a steady pace. A typical 30-second scan produces ~80 keyframes and a 10–20 MB zip.

Capture from multiple angles

Shoot from low, mid, and high heights. Ensure adjacent viewpoints have 70–80% overlap for best reconstruction.

Use good lighting

Consistent, well-lit environments produce better results. Avoid reflective and transparent surfaces, since LiDAR struggles with glass and mirrors.

Capture Settings

New in v1.2.6

The 3D-scene recording formats (COLMAP, Nerfstudio, Nerfstudio + Depth, Point Cloud PLY) share a Capture Settings sheet, opened via the settings pill below the format picker. Replaces the previous "No settings for this mode" placeholder.

Capture Rate30 Hz / 15 Hz / ~7 Hz

iPhone LiDAR runs at 30 Hz. LOTA filters out duplicate-buffer ARFrames first (ARKit hands back the same depth buffer on every other 60 Hz callback), then strides on the unique scans, so the picker reflects real LiDAR scans per second. 30 Hz keeps every scan (densest), 15 Hz takes every other, ~7 Hz every fourth. Lower rates skip per-pixel unprojection and JPEG encoding for the dropped scans, roughly halving or quartering the captured point count and zip size. Useful when you're scanning a small subject for a Gaussian-splat training set and don't need every scan, or when throttling a long scan to keep the device cool.

PLY FormatBinary / ASCIINew in v1.2.7

Shown only when the Point Cloud format is selected. Binary (binary_little_endian) is the default: compact, fast, and read natively by OpenSplat, COLMAP, MeshLab, and CloudCompare. ASCII writes a human-readable text PLY, one x y z r g b line per point, easier to inspect in an editor or parse in a custom script but roughly 3× larger before zipping. ASCII artifacts are named LOTA_<timestamp>_PLY_ASCII.zip so the encoding is obvious from the filename.

Material Capture

Select Material from the format picker to swap the page from a recording flow to a plane-lock and shutter flow. One tap, ~½-second bake on iPhone 15/16 Pro at 1024², ZIP saved to your iCloud folder. Requires LiDAR.

When to use Material vs the other formats

For a 3D scan of an object or room you can move around, use COLMAP, Nerfstudio, or Point Cloud
For a tileable PBR texture of a flat surface (floor, wall, table, fabric, brick), use Material

Pick Material from the format picker

The page swaps to the material-capture UI: status icon, plane-status chip, Lock Plane action, and shutter button.

Pick an export folder

Tap the folder icon (bottom left) to choose an iCloud Files folder if you haven't already.

(Optional) Open Material Settings

Adjust output resolution (512 / 1024 / 2048), AO sample count (32 / 64 / 128), normal convention (OpenGL or DirectX), delight strength, and roughness scale. All persist via UserDefaults.

Point at a flat surface and tap Lock Plane

ARKit raycast detects horizontal and vertical planes. Lock Plane snaps a 20 cm screen-aligned square patch centered on what you're pointing at. In the output, "top" is away from the camera and "right" is the camera's right.

Tap the shutter

The torch fires a brief flash-pair sequence (~200 ms), autofocus locks for both grabs so they share the same focal distance, then the bake runs (~220 ms at 1024² with 64 AO samples on iPhone 15/16 Pro).

Save to Files

The Material Save Summary sheet shows a live PBR sphere preview, file-size estimate, and a metallic toggle. Tap Save: the ZIP is written to your iCloud folder, the sheet auto-dismisses, and the plane lock stays active for the next shot.

Bake performance

Measured on iPhone 15 / 16 Pro at 1024² output:

AO 32 samples: ~150 ms
AO 64 samples (default): ~220 ms
AO 128 samples: ~400 ms
2048² output: roughly 4× the time

What's in the ZIP

basecolor.png: sRGB 8-bit albedo, white-balanced and de-lit via flash-pair specular subtraction
normal.png: linear 8-bit tangent-space normal from the LiDAR depth gradient (OpenGL +Y up by default; DirectX selectable in Material Settings)
height.png: linear 16-bit, plane-relative ±25 mm range mapped to UInt16
ao.png: linear 8-bit horizon-based ambient occlusion baked from the height map
roughness.png: linear 8-bit per-texel estimate from flash-pair specular sharpness, multiplied by your roughness scale slider
preview.png: sRGB 8-bit Cook-Torrance BRDF sphere render of the captured material
material.json: manifest with planeMeters (physical patch size), tilingHintFor1mSquare, normal convention, roughness method, metallic value, device hardware identifier, LiDAR generation, capture date, and gyro drift between flash pair

Receiver workflows

Substance Designer. Drop the unzipped folder onto a new graph; Substance auto-creates a Material node with all six channels wired
Substance Painter. Drag the folder into Shelf → Materials, then apply to a mesh
Blender. In the Shading editor, add an Image Texture per file and wire to a Principled BSDF; set non-color spaces on normal, roughness, ao, and height
Unreal Engine. Drop the folder into the Content Browser and Unreal auto-creates a Material Instance. Switch normal convention to DirectX in Material Settings before capture so the green channel reads correctly
Unity. Standard URP / HDRP Lit shader inputs map 1:1
TouchDesigner. Load each PNG into a Movie File In TOP, wire to a PBR MAT

Tips for best results

Hold the phone stillduring the shutter tap. The flash-pair takes ~200 ms, and the manifest's gyroDriftRadiansfield surfaces how much drift the bake saw (below 0.5° / ~0.009 rad is fine)
Capture distance.20–60 cm from the surface gives the best detail-to-coverage ratio at the default 20 cm patch size
Lighting. Ambient light plus the iPhone torch. Too-bright ambient washes out the controlled flash signal that drives the roughness estimate
Best surfaces. Wood, fabric, leather, brick, concrete, plaster, painted drywall, asphalt, ceramic tile (matte), unpolished plastic, paper, carpet, bark, stone: anything opaque, dielectric, and roughly diffuse

Limitations & v1.2 caveats

Surfaces where the pipeline degrades or fails: glass (LiDAR passes through), mirrors and chrome (captures the reflection, not the surface), polished metal (basecolor contaminated by reflections), skin / wax / translucent plastic (subsurface scattering not modeled), velvet and fur (anisotropic / sheen BRDFs not modeled), wet surfaces (captures the wet film).

v1.2 caveats (planned for v1.3+):

Metallic is a uniform toggle (manifest only), with no per-texel metallic.png. v1.3 adds CoreML SVBRDF inference
Roughness is a heuristic (flashPairSpecularSharpness); adjust the roughness scale slider to bias the result
Framing rect is auto-installed at 20 cm; drag handles to resize and reposition land in v1.3
Single-shot only; multi-view capture for cleaner albedo is planned for v1.3

IMU Trace and Audio Trace

Pick IMU Trace or Audio Tracefrom the format picker to record sensor or audio analysis data over time as a structured file you can hand to a data science / motion / music workflow. The right page swaps to the matching live visualization (the same scrolling lane graphs as the middle page's Motion / Audio modes) while the trace is being collected.

Pick the format

Tap IMU Trace or Audio Trace in the format picker.

Set the export folder

Tap the folder icon (bottom left) to choose an iCloud Files folder if you haven't already.

Open IMU/Audio Trace Settings

Choose which sensor channels to include, the file format (CSV / TSV), whether to bundle a PDF report, and an optional auto-stop duration cap (5–600 s).

Tap the shutter to start

Sample counter and elapsed timer update at 10 Hz. The same scrolling lane visualization renders live so you can confirm signal is coming through.

Tap again to stop (or let it auto-stop)

Saves to your iCloud folder. A save-summary sheet shows file size and sample count.

What's in the ZIP

recording.csv or .tsv: one row per sample. The first column is t_seconds; the rest are described in the manifest
manifest.json: recording metadata (device, version, timestamps, sample count, sample rate, stop reason) plus a per-column dictionary (name, unit, dtype, description). Device names are mapped from sysctl hardware IDs to consumer names (e.g. iPhone 17 Pro Max)
README.txt: one-paragraph human description of the bundle layout
report.pdf (optional): multi-page dark-theme report with cover, per-channel charts, and a statistics table

PDF report contents

IMU Trace: Cover · Accelerometer X/Y/Z line · |A| magnitude · Gyroscope X/Y/Z line · Compass heading polar rose (if heading recorded) · Pressure & altitude (if pressure recorded) · Per-channel statistics
Audio Trace: Cover · FFT spectrogram (20 bands × time, viridis heatmap) · Bass/Mid/High levels · Beat events timeline (per band) · Dynamics burst trace · Per-channel statistics

Charts that have no data to plot show a centered "No data captured" placeholder so the recipient can tell an empty page is intentional, not broken (typical for an Audio Trace recorded in a silent room, where the FFT charts populate and the beat-events page shows the placeholder).

When to use these

Capture a short walk to plot device motion against ground-truth GPS in a Jupyter notebook
Record a song through the mic to extract beat onsets for an Ableton MIDI workflow
Sample IMU data from a moving rig and ship it to TouchDesigner / Blender as a baked animation source
Build a training dataset for sensor-fusion or audio-classification models

PLY Viewer

New in v1.2.6

Select PLY Viewer from the format picker to open the read-only point cloud inspector. ARKit pauses (the live camera goes black), and the right page swaps to a dark canvas with the bottom shutter morphed into a folder glyph. Tap the folder to open the file picker (filtered to .ply via the new dev.lota.ply UTI), choose any PLY on the device (LOTA captures, iCloud Drive, or third-party sources), and the cloud loads off the main thread. The viewer handles two kinds of .ply: a plain point cloud renders as flat points, while a 3D Gaussian Splat file is detected automatically and rendered as splats.

Once a cloud is loaded

Info chip. A centered top chip shows point count (thousand-separated, or a splat count for a Gaussian splat), AABB extents (in metres or centimetres), file size, and yellow subsampled / no RGB badges when applicable
Neon axis gizmo. Six axis caps in the top-trailing corner, projected from the live camera basis. Positives carry X/Y/Z letters; negatives are hollow rings (Blender convention). Tap any cap to spring-snap the camera to that view. The gizmo stays anchored to world axes regardless of mirror toggles
Folder button. The bottom shutter becomes a load-new-file button so you can swap clouds without leaving the mode

Gestures

One-finger drag: orbit yaw / pitch
Two-finger drag: pan the orbit target
Pinch: zoom (clamped to 0.05× / 8× of fit-bounds distance)
Double-tap: reset to fit-bounds with a 15° downward pitch

Color modes

Original. Per-point RGB from the file. Falls back to white if the file has no color
Solid. A single user-picked color, useful for inspecting shape without color noise
By Height. Y coordinate mapped through the selected palette; tall features pop visually
By Distance. Distance from the camera to each point, recomputed live as you orbit. Good for spotting depth structure; noisy clouds look like a fog when colored this way
By Axis. Same as By Height but along any of X / Y / Z. Lets you find the long axis of an asymmetric scan

The scalar modes (Height / Distance / Axis) share the nine LiDAR depth palettes.

Gaussian splats

New in v1.2.7

When the loaded file is a 3D Gaussian Splat (the layout OpenSplat and other 3DGS trainers write), the viewer detects it from the header and renders it with proper splatting: each Gaussian is an oriented, anisotropic, alpha-blended ellipse rather than a flat dot, depth-sorted on the GPU so it composites correctly from every angle.

The info chip shows a splat count instead of a point count
The settings sheet replaces the Color section and Point Size slider with a single Splat Size slider (0.25×–3.0×, default 1.0×). Below 1.0× thins the splat to reveal structure; above 1.0× fills gaps in a sparse splat. Splats carry their own baked color, so the point-cloud color modes do not apply
Orientation (camera mode, spin, Y-up / Z-up, mirrors, Reset View) works exactly as it does for a point cloud. OpenSplat results are not always Y-up, so if a splat loads tipped on its side, flip Y-up / Z-up first
View-dependent shading (specular glints from higher-order spherical harmonics) is not rendered yet, so the splat can look slightly flatter than it does in a desktop viewer
Capped at 2,000,000 Gaussians. Larger splats are uniformly sub-sampled and the chip shows the subsampled badge

Limits and behavior

5 M point cap. Larger files are uniformly sub-sampled and the chip shows a subsampled badge. Keeps memory predictable; a 5 M-point cloud needs ~120 MB of GPU buffers
Binary little-endian PLY only. Matches what LOTA writes and what most photogrammetry / DCC tools export. ASCII and big-endian PLY produce a clear "Unsupported PLY format" error
Extra properties (normals, alpha, intensity) are tolerated. The parser builds an offset map for whatever the file declares, so clouds from Blender, CloudCompare, and Nerfstudio load cleanly as long as x y z (and optionally red green blue) are present
Mirror toggles apply at draw time (model matrix), not at parse time, so toggling does not re-read the file
ARSession resumes as soon as you switch off the PLY Viewer format. Swiping between pages while a cloud is loaded keeps it in memory; no need to re-pick the file on return

Tips

For files saved from Blender or CloudCompare, the Y-up / Z-up setting in Orientation is the first thing to check if the cloud appears tipped on its side. LOTA's own captures are Y-up
The By Distance color mode is a quick "is this cloud noisy from the inside?" inspection, since noise looks like a fog when colored by distance from the orbit center
Auto Spin at ~0.2 rev/s is a good speed for casually presenting a capture without manual orbiting

Configuration

Settings Reference

Settings are split into focused sheets reachable from two places. Transmission Settings (tap the floating status bar pill at the top of the Camera / Streaming or ARKit Tracking page) holds Receiver, Transport, NDI, OSC, Point Cloud Stream, Protocol Info, a Help section with Replay Tutorial, and an Acknowledgements section listing third-party SDK notices. Mode-specific settings (tap the <Mode> Settings button just below the status bar) show only the section relevant to the active mode. Color and Mono have no per-mode settings, so no button is shown. The receiver IP is shared across all transports. The reference below documents every setting; the headings indicate which sheet hosts each section.

Each setting lists its default value in green next to the name.

This Device: Transmission Settings (top of sheet)

Active IPv4 addresseslive, read-only

Pinned to the top of Transmission Settings. Lists the iPhone's currently active IPv4 addresses with friendly interface labels: Wi-Fi (joined to a network), USB Hotspot (Personal Hotspot enabled while plugged into a Mac via USB, e.g. 172.20.10.1), Ethernet (USB-Ethernet adapter for multi-device wired rigs). Tap any row to copy the address with a haptic + brief "Copied" chip. The list updates live as you toggle Personal Hotspot, plug in / unplug USB-Ethernet, or join / leave Wi-Fi, with no need to re-open the sheet. These are the iPhone's own addresses, shown to help you set the right Receiver IP on your computer; not values to paste into LOTA's Receiver IP field.

Receiver: Transmission Settings

Receiver IP192.168.1.100

IP address of the computer receiving streams. Shared across all transports (TCP, UDP, OSC, PLY).

Device Labelrandom 4-char ID (e.g. 7F3A)

4–20 character ID used to build the NDI broadcast name LOTA - <label>. Random base36 default on first launch (e.g. LOTA - 7F3A); customizable to anything memorable like Stage Left, FOH, or Front Camera. Persists across launches. Critical for multi-device venues so each LOTA phone is distinguishable in TouchDesigner / OBS / NDI Studio Monitor. Changes take effect within ~1 s, with no need to stop and restart streaming.

Transport (TCP/UDP): Transmission Settings

TCP/UDP OutputOn

Enable or disable the video transport.

ProtocolTCP

TCP (reliable, auto-reconnects) or UDP (fire-and-forget). TCP sends H.264 video for Color/Mono, raw Float32 depth for Depth/Point Cloud modes.

Port9847

Destination port for TCP/UDP streams.

NDI: Transmission Settings

NDI Video OutputOff

Enable or disable NDI streaming. Auto-broadcasts as LOTA - <Device Label> (default LOTA - <random 4-char ID>, customizable in the Receiver section). Multiple LOTA devices on the same network show up as distinct sources thanks to per-device labels.

Side-by-SideOff

Sends a 2x-wide frame: left half is the camera view, right half is the depth colormap. Uses LiDAR depth on Pro phones; falls back to Depth Anything V2 estimated depth on non-LiDAR phones, so the side-by-side workflow is no longer LiDAR-only. Disabled on Transcription / Motion / Audio modes (no camera content to composite) with a one-line caption explaining why.

OSC: Transmission Settings

OSC OutputOff

Enable or disable OSC messages.

Port9000

OSC destination port.

Acknowledgements: Transmission Settings (bottom of sheet)

NDI® trademark noticestatic

Pinned to the bottom of Transmission Settings, below Help → Replay Tutorial. Reads NDI® is a registered trademark of Vizrt NDI AB. with a link to ndi.video. Required attribution for the NDI SDK that ships inside LOTA.

Depth: Depth Settings sheet

Color MapVisible Spectrum

Colormap for depth visualization. Options: Black & White, Black Aqua White, Blue Red, Deep Sea, Color Spectrum, Incandescent, Heated Metal, Sunrise, Visible Spectrum.

Neural Depth: Neural Depth Settings sheet

Color MapVisible Spectrum

Colormap for AI-estimated depth visualization. Same nine options as LiDAR Depth.

About the Model section credits ByteDance Research (model authors) and Apple (Core ML packaging), confirms inference runs on the Apple Neural Engine at ~30 ms / frame, and links to the Hugging Face model page. Apache 2.0; nothing leaves the device.

Point Cloud: Point Cloud Settings sheet

Max Depth5.0m (range 1–10m)Older devices

Maximum LiDAR range. Points beyond this are discarded. Gen 1 LiDAR (iPhone 12–14 Pro) is reliable to ~5m. Gen 2 (iPhone 15–16 Pro) can reach ~10m. Affects both the PLY stream and Gaussian Capture exports.

Compute QualityBalancedOlder devices

Rate at which unique LiDAR scans are processed and streamed over PLY (the sensor runs at 30 Hz). Full processes every scan (~30 Hz), Balanced every other (~15 Hz), Efficient every third (~10 Hz). Lower rates reduce GPU work and heat on long sessions. Affects the live PLY stream only; Gaussian Capture has its own rate dial.

Min ConfidenceMedium+

LiDAR depth confidence filter. All = no filtering, maximum density. Medium+ = removes low-confidence noisy edge pixels. High Only = fewest points, highest accuracy. Affects both live view and Gaussian Capture exports.

Gaussian Capture: Capture Settings sheet

Capture Rate30 Hz (30 / 15 / ~7)Older devices

Rate at which unique LiDAR scans are appended to the dataset. iPhone LiDAR runs at 30 Hz; LOTA filters duplicate-buffer ARFrames first, then strides on the unique scans. 30 Hz keeps every scan (densest), 15 Hz takes every other, ~7 Hz every fourth. Lower rates skip unprojection and JPEG encoding for the dropped scans, roughly halving or quartering point count and zip size. Useful for trading density for cost on long static holds or thermally constrained sessions. Applies to COLMAP, Nerfstudio, Nerfstudio + Depth, and Point Cloud PLY formats.

PLY FormatBinary (Binary / ASCII)

Point Cloud export encoding, shown only when the Point Cloud format is selected. Binary (binary_little_endian) is compact and read natively by OpenSplat, COLMAP, MeshLab, and CloudCompare. ASCII writes a human-readable text PLY (one x y z r g b line per point), easier to inspect or script against but roughly 3× larger before zipping; ASCII artifacts are named LOTA_<timestamp>_PLY_ASCII.zip.

PLY Viewer: PLY Viewer Settings sheet

Color ModeOriginal

Original (per-point RGB) / Solid (color picker) / By Height (Y coordinate) / By Distance (from camera, live as you orbit) / By Axis (signed distance along chosen X/Y/Z). Original on a colorless file falls back to white.

Paletteshared LiDAR palettes (9 presets)

Colormap for the scalar modes (By Height / By Distance / By Axis). Same nine presets as the LiDAR depth visualization.

Camera ModeFree Orbit

Free Orbit / Auto Spin / Top / Front / Side. Presets snap via the same animated path as tapping a gizmo cap.

Spin Speed0.2 rev/s (range 0.05–2.0)

Auto-rotation speed when Camera Mode is Auto Spin. Hidden in other modes.

Up AxisY-up

Y-up / Z-up segmented control. Matches the source file's convention. LOTA captures are Y-up; flip to Z-up for files saved from Blender or similar DCCs that expect Z-up.

Mirror X / Y / ZOff

Per-axis mirror toggles. Applied at draw time via the model matrix, so flipping does not re-read the file. The orientation gizmo stays anchored to world axes regardless.

Reset Viewaction

Re-runs fit-bounds with a 15° downward pitch, the same result as double-tapping the canvas.

Point Size3.0 (range 1.0–12.0)

Per-point screen size. All settings persist between launches.

Splat Size1.0× (range 0.25–3.0×)

Shown in place of the Color section and Point Size slider when the loaded file is a Gaussian splat. Scales every Gaussian: below 1.0× thins the splat to reveal structure, above 1.0× fills gaps in a sparse splat.

Blob Tracking: Detection (Blob Track Settings sheet)

Base StyleColor

What the underlying camera layer looks like behind the rectangle outlines. Color (live RGB), Mono (grayscale), Mask (grayscale subject silhouette on black), or Binary (white-on-black silhouette, the authentic TouchDesigner Blob Track TOP look).

Draw Blob BoundsOn

Draw 1-pixel hairline rectangle outlines around detected blobs. Turn off to see only the base layer.

Show ID LabelsOff

Draw #id text labels just outside the top-right corner of each blob bounding box.

Blob ColorPure green

RGB color used for both rectangle outlines and ID labels.

Blob Tracking: Depth (Blob Track Settings sheet)

Min Depth0.5 m (range 0.1–5.0 m)

Near edge of the depth slab. Pixels closer than this are excluded from blob detection.

Max Depth3.0 m (range 0.5–10.0 m)

Far edge of the depth slab. Pixels farther than this are excluded from blob detection.

Min ConfidenceMedium+

LiDAR depth confidence filter (same semantics as Point Cloud). All / Medium+ / High Only. Pixels below the threshold are excluded from detection.

Blob Tracking: Constraints (Blob Track Settings sheet)

Min Blob Size50 px (range 10–500)

Minimum pixel area to count as a blob. Filters out single noise pixels and tiny artifacts.

Max Blob Size30000 px (range 100–49152)

Maximum pixel area. Rejects the "everything is one blob" failure mode where the entire scene merges together.

Max Move Distance0.15 (range 0.01–0.5)

Maximum normalized distance a blob can travel between frames and keep its ID. Increase for fast-moving subjects or higher-FPS scenes.

Blob Tracking: Revival (Blob Track Settings sheet)

Revive BlobsOn

Re-identify lost blobs with the same ID if they reappear inside the revival window. Mirrors TouchDesigner's blob lifecycle exactly.

Revive Time0.5 s

Seconds a lost blob is kept alive for revival before it transitions to expired and is permanently dropped.

Revive Distance0.2

Maximum normalized distance for the revival match. A lost blob must reappear within this radius to recover its ID.

Include Lost in OSCOff

Emit lost blobs in the OSC bundle with state = 2.

Include Expired in OSCOff

Emit expired blobs in the OSC bundle with state = 3.

Point Cloud Stream (PLY): Transmission Settings

PLY StreamingOff

Enable or disable live point cloud TCP stream.

Port9848

PLY stream destination port.

Wire format (binary, fixed): UInt32 LE point count followed by N × (3 Float32 XYZ + 3 UInt8 RGB) = 15 bytes per point. Drop in LOTABinaryPLYRecieverV2.tox from the TouchDesigner section above for plug-and-play setup. The CSV-text variant and its toggle were removed in v1.2.3; binary is the only supported wire format. Custom receivers built against the legacy CSV stream will need to switch to the binary parser.

Tracking: Body / Face / Hands Settings sheets

Skeleton OverlayOn

Show or hide body skeleton visualization on the tracking page.

Face OverlayOn

Show or hide face blend shape bar graph on the tracking page.

Hand OverlayOn

Show or hide hand bone and joint visualization on the tracking page.

3D Hand CoordinatesOff

Use LiDAR-projected world-space hand coordinates instead of normalized screen-space. Requires a LiDAR-equipped device.

Transcription: Transcription Settings sheet

Send Per-Word OSCOn

Send each recognized word as it arrives (/lota/speech/word + TCP/UDP word frames). The /lota/speech/word_count integer always fires alongside per-word messages as a numeric CHOP-visible signal.

Send Partial TranscriptOn

Send running partial transcript updates as words arrive (/lota/speech/partial + TCP/UDP partial frames).

Send Final TranscriptOn

Send finalized sentences when the recognizer commits (/lota/speech/final + TCP/UDP final frames).

These toggles apply to both OSC and TCP/UDP transports simultaneously. Turn off anything you don't need to reduce noise in your receiver.

Device Motion: Motion Settings sheet

Acceleration (X, Y, Z)On

Gravity-removed acceleration in G-force, sent as three OSC floats per update (/lota/motion/accel/x, /y, /z).

Gyroscope (X, Y, Z)Off

Rotation rate in rad/s, three OSC floats per update (/lota/motion/gyro/x, /y, /z).

Compass HeadingOff

Magnetic heading in degrees 0–360 (requires Location permission). Address /lota/motion/heading.

Barometric PressureOff

Atmospheric pressure in kPa plus relative altitude in meters (/lota/motion/pressure, /lota/motion/altitude).

Update Rate30 HzOlder devices

30 / 60 / 100 Hz picker. 30 Hz matches ARKit. 100 Hz is useful for latency-critical controllers.

Each enabled sensor draws a scrolling graph lane on screen and streams over OSC. Toggling a sensor off mid-session removes its graph lane and stops its OSC channel within ~50 ms (debounced to prevent crash loops from rapid toggles). Permissions are requested upfront when entering Motion mode.

Audio Analysis: Audio Settings sheet

Levels (Bass / Mid / High)On

Continuous 0–1 energy per frequency band with rolling auto-gain. /lota/audio/bass, /lota/audio/mid, /lota/audio/high.

Beat DetectionOff

Binary 0/1 switch per band on detected onset (50 ms gate window). /lota/audio/drums/low, /lota/audio/drums/mid, /lota/audio/drums/high.

DynamicsOff

/lota/audio/burst: fast transient detector, pulse-shaped 0–1 (rises instantly, decays over ~200 ms).

FFT Spectrum (20 bands)Off

20 log-spaced FFT bands across 20–20,000 Hz, each normalized 0–1. /lota/audio/fft/0 through /lota/audio/fft/19, rendered with a red-to-blue color gradient.

Update Rate30 HzOlder devices

30 / 60 Hz picker. Analysis runs internally at ~86 Hz and decimates to the chosen output rate.

Microphone permission is requested upfront when entering Audio mode. Uses the same NSMicrophoneUsageDescription as Transcription mode. LOTA can only analyze the microphone input, since iOS sandboxing prevents capturing system audio from other apps.

Inclusive Design

Accessibility

LOTA is designed to meet Apple's App Store accessibility standards. Every feature works for every user.

VoiceOver

Every control is labeled and announced. Mode switches, streaming state, and recording status are all spoken aloud.

Voice Control

Say stream, record, or switch to Depth and LOTA responds. Every button is discoverable by voice.

Dynamic Type

Text scales up to 200%+. The UI reflows to a single-column layout at extreme sizes so nothing gets cut off.

Dark Interface

Designed dark from the start. Every screen, menu, and control uses a true dark color scheme, comfortable in any environment.

Differentiate Without Color

Status indicators swap to distinct symbols when this setting is enabled. Shapes, icons, and text labels carry the meaning, so color is never the only cue.

Increase Contrast

Swaps blur materials for solid backgrounds and boosts status colors for guaranteed readability over any camera feed.

Reduce Motion

All transitions respect the system Reduce Motion setting. Visual feedback stays, decorative animation goes.

Support

Frequently Asked Questions

Which iPhones support LOTA?

Any iPhone running iOS 26.2 or later. LiDAR features (Depth, Point Cloud, Blob Track, Gaussian Capture, Material Capture, 3D hand coordinates) require iPhone 12 Pro or later. Color, Mono, Neural Depth (Depth Anything V2 on the Apple Neural Engine), Transcription, Motion, and Audio modes, plus NDI and TCP/UDP streaming, all work on iPhones without LiDAR. Neural Depth also serves as the side-by-side NDI fallback, so multi-pane workflows run on any iPhone. Face tracking requires the TrueDepth camera (iPhone X or later); body tracking requires an A12 chip or later. LOTA is officially iPhone-only. iPad Pro models with LiDAR have run successfully in beta testing, but iPad is not an officially supported target, so your mileage may vary.

Does Transcription mode need internet access?

No. Transcription uses iOS 26's on-device SpeechAnalyzerframework and runs entirely offline. After you grant Speech Recognition permission on first launch, LOTA prefetches the speech model for your locale in the background, so by the time you open Transcription mode the model is on disk and the first captions arrive with no download wait. If the prefetch didn't finish, the model installs lazily on first use. Audio never leaves your device.

Do the receiving machine and iPhone need to be on the same network?

Yes. For TCP, UDP, OSC, and PLY streaming, both devices must be on the same local network. NDI also uses the local network but handles discovery automatically. A 5 GHz Wi-Fi connection is recommended.

What software can receive LOTA streams?

Any NDI-compatible software (TouchDesigner, OBS, vMix, Resolume), any tool that reads TCP/UDP sockets, and any OSC-capable application (Max/MSP, Ableton, Unreal Engine).

How do I use the export data for Gaussian Splat training?

Export a COLMAP or Nerfstudio dataset from the Gaussian Capture page, transfer the zip to your training machine, and point OpenSplat, Nerfstudio, or gsplat at the extracted folder. The files are in the exact format these tools expect.

Is there a latency cost to streaming?

NDI and TCP/UDP streams typically add 1–3 frames of latency depending on your network. OSC tracking data arrives at 30 Hz with sub-frame latency. A wired connection or 5 GHz Wi-Fi keeps things as fast as possible.

What export format should I use?

COLMAP for OpenSplat and gsplat. Nerfstudio for splatfacto and Instant-NGP. Nerfstudio + Depth for best geometry on flat or featureless surfaces. Point Cloud (PLY) for quick visualization in Blender or CloudCompare.