Introduction

Getting Started

LOTA turns your iPhone's LiDAR sensor into a professional spatial capture and streaming tool. Stream depth, color, and point cloud data over the network in real time, capture datasets for 3D reconstruction, and stream motion capture data — no extra hardware required.

Requirements

  • iPhone 12 Pro or later (LiDAR required for Depth, Point Cloud, Blob Track, and Gaussian Capture)
  • Color, Mono, Transcription, Motion, and Audio modes work on all iPhones
  • iOS 26.2 or later
  • Wi-Fi network (for streaming features)
1

Install LOTA and complete first launch

Download LOTA from the App Store. The first time you open the app, a welcome screen lists the four iOS permissions LOTA needs (Camera, Microphone, Speech Recognition, Location for compass heading) plus a privacy reassurance — LOTA doesn't track or store anything; captures stream over your local network or save to the iCloud folder you choose. Tap Continue and the four OS permission dialogs appear in sequence.

2

Take the 9-step guided tour

After permissions, a hybrid welcome card → spotlight tour walks you through the three pages, the mode picker, the status bar, mode-specific settings, and the transmit button. Tap anywhere to advance, or use the Skip button at the bottom-left. Replay anytime from Transmission Settings → Help → Replay Tutorial.

3

Choose a capture mode

Tap the mode picker at the top — a glass capsule showing the active mode's icon next to a 3×2 dot-grid glyph, mirroring the iOS Camera app's "more controls" affordance. A glass panel drops down with all 8 modes laid out in a 2×4 grid of circular icon buttons (Color, Mono, Depth, Point Cloud, Blob Track, Transcription, Motion, Audio). Active mode is highlighted yellow; LiDAR-required modes are dimmed and disabled on non-LiDAR devices. Each mode activates instantly — no restart required.

4

Start streaming or recording

Tap the transmit button (bottom right) to broadcast over the network — all enabled transports send simultaneously. Configure transports by tapping the floating status bar pill at the top (opens Transmission Settings). Tweak per-mode options via the <Mode> Settings button just under the status bar. Swipe right to access Gaussian Capture for recording 3D datasets or capturing PBR materials.

Capture

Capture Modes

LOTA provides eight distinct capture modes on the Camera / Streaming page. Tap the mode picker at the top (glass capsule with the active mode's icon next to a 3×2 dot-grid glyph) to open a glass panel laid out as a 2×4 grid of circular icon buttons. Active mode is highlighted yellow; LiDAR-required modes are dimmed on non-LiDAR devices. Tap outside the panel to dismiss without picking.

Color

Live RGB camera feed at 60 FPS. What you see on screen is exactly what gets streamed. Works on all iPhones — no LiDAR required.

Mono

High-contrast grayscale feed optimized for low-light environments and precision spatial scanning. Works on all iPhones — no LiDAR required.

Depth

LiDAR depth visualization with 9 selectable colormaps including Heated Metal, Incandescent, Deep Sea, and Visible Spectrum. Requires a LiDAR-equipped iPhone.

Point Cloud

Real-time 3D point cloud rendered with true RGB colors. Every pixel of the 256×192 depth map is unprojected into 3D space. Configure frame window, max depth, and compute quality in Settings. Requires LiDAR.

Blob Track

TouchDesigner-compatible blob tracker. Carves a configurable depth slab out of the LiDAR scan, finds connected regions, assigns each one a stable ID across frames, and streams per-blob metadata over OSC at TD-matching addresses. The full visualization (camera + outlines + ID labels) is captured by NDI. Requires LiDAR.

Transcription

Live on-device speech-to-text with a mirrored bar waveform visualization. Recognized words stream out over OSC, TCP, UDP, and appear on screen as captions. Works on every iPhone — no LiDAR required.

Motion

Turns the iPhone into a wireless motion-sensor OSC source. Streams accelerometer, gyroscope, compass heading, and barometric pressure as per-axis OSC channels. Each active value draws its own scrolling line graph on screen (TouchDesigner CHOP viewer aesthetic), captured by NDI. Works on every iPhone — no LiDAR required.

Audio

Real-time microphone audio analysis using Apple's Accelerate/vDSP framework. Streams frequency-band Levels, per-band Beat Detection triggers, Dynamics bursts, and a 20-band FFT spectrum as OSC channels. Each active channel renders as a scrolling graph lane, captured by NDI. Works on every iPhone — no LiDAR required.

Switching modes

Tap the mode picker at the top of the screen and select a cell from the dot-grid panel, or say switch to Depth with Voice Control enabled. Switching is instant and does not interrupt an active stream.

Blob Track mode

Select Blob Track from the mode picker to turn the iPhone into a wireless TouchDesigner-compatible blob tracker. LOTA carves a configurable depth slab out of the LiDAR scan, finds connected regions of in-range pixels, assigns each one a stable ID across frames, and streams the result as both a video (NDI) and per-blob metadata (OSC). Replaces a Kinect + Blob Track TOP chain with a single iPhone — no background plate, no lighting calibration, works on a moving camera.

While active

  • Blob count HUD — An N blob(s) count appears in the status bar with a hex-grid icon
  • Live depth range — The current slab (e.g. 0.5m – 3.0m) is shown just below the status pill so operators can verify the active range without opening Settings
  • Hairline outlines — Each detected blob gets a 1-pixel rectangle drawn over the camera feed in your chosen color
  • NDI captures everything — The full composition (base layer + rectangles + optional ID labels) is captured by NDI so receivers see exactly what the phone shows
  • Background detection — Connected-components labeling runs off the ARKit delegate thread so the main thread stays responsive for touch and UI

Base styles

Set in Blob Track Settings → Detection (tap the Blob Track Settings button below the status bar). Controls what the underlying camera layer looks like behind the rectangles.

  • Color (default) — Live color camera feed. Operator-friendly view, see what you're filming
  • Mono — Grayscale camera feed. Cleaner look, no color distractions
  • Mask — Grayscale subject silhouette on black. Verifies what's being detected and hides the background
  • Binary — Pure white silhouette on black. Authentic TouchDesigner Blob Track TOP look — maximum contrast

Toggle Draw Blob Bounds off to see only the base layer. Toggle Show ID Labels on to draw #1, #7, etc. labels just outside the top-right corner of each bbox. Both rectangles and labels use the Blob Color picker.

OSC addresses (TD-compatible)

Field names match TouchDesigner's blobtrackTOP_Class verbatim. Every message is padded to 10 slots so CHOP channel counts stay stable as blobs come and go — empty slots have id == 0 for filtering via a Select CHOP.

  • /lota/blob/count — active blob count (int)
  • /lota/blob/ids — 10 stable tracker IDs
  • /lota/blob/u, /lota/blob/v — normalized centroid (10 floats each)
  • /lota/blob/width, /lota/blob/height — normalized bbox size (10 floats each)
  • /lota/blob/tx, /lota/blob/ty — pixel centroid in 256×192 depth space (10 ints each)
  • /lota/blob/age — seconds tracked (10 floats)
  • /lota/blob/state — 0=new, 1=revived, 2=lost, 3=expired (10 ints)
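
For receivers outside TouchDesigner, here is a minimal Python sketch of a blob listener. It assumes the python-osc package (any OSC library works) and LOTA's default OSC port 9000; the id == 0 filter mirrors the Select CHOP trick described above.

    # pip install python-osc
    from pythonosc.dispatcher import Dispatcher
    from pythonosc.osc_server import BlockingOSCUDPServer

    latest = {}  # most recent value tuple per OSC address

    def store(address, *args):
        latest[address] = args

    def on_ids(address, *ids):
        latest[address] = ids
        # Slots are padded to 10; id == 0 marks an empty slot.
        for slot, blob_id in enumerate(ids):
            if blob_id == 0:
                continue
            u = latest.get("/lota/blob/u", (0.0,) * 10)[slot]
            v = latest.get("/lota/blob/v", (0.0,) * 10)[slot]
            print(f"blob {blob_id}: centroid u={u:.3f} v={v:.3f}")

    dispatcher = Dispatcher()
    dispatcher.map("/lota/blob/ids", on_ids)
    dispatcher.set_default_handler(store)  # u, v, width, height, age, state, ...
    BlockingOSCUDPServer(("0.0.0.0", 9000), dispatcher).serve_forever()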

Lifecycle states

A blob fires state = 0 (new) on first detection, then is silent while it's actively tracked. If it disappears, it transitions to lost and is held in the revival window for Revive Time seconds. If it reappears within Revive Distance of where it was last seen, it gets the same ID back and fires revived. Otherwise it transitions to expired and is permanently dropped. This matches TouchDesigner's blob lifecycle exactly.

TCP / UDP fallback

In blob mode, TCP/UDP carries raw Float32 depth maps (the same format as the Depth capture mode), so receivers that want to do their own client-side analysis get the raw data. The structured per-blob metadata is OSC-only.
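
For a custom receiver, a hedged Python sketch of that raw depth path. It assumes each frame is a 24-byte FrameHeader (as in the speech wire format described below) followed by a 256×192 little-endian Float32 depth map, and that LOTA connects out to the receiver's listening port; verify both assumptions against your own capture before relying on them.

    import socket
    import numpy as np

    HEADER_SIZE = 24                  # assumed FrameHeader size
    W, H = 256, 192                   # LiDAR depth map resolution
    FRAME_BYTES = HEADER_SIZE + W * H * 4

    # LOTA dials out to the destination IP/port, so listen here.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("0.0.0.0", 9847))    # LOTA's default TCP/UDP port
    server.listen(1)
    conn, _ = server.accept()

    buf = b""
    while True:
        chunk = conn.recv(65536)
        if not chunk:
            break
        buf += chunk
        while len(buf) >= FRAME_BYTES:
            frame, buf = buf[:FRAME_BYTES], buf[FRAME_BYTES:]
            depth = np.frombuffer(frame[HEADER_SIZE:], dtype="<f4").reshape(H, W)
            print("center depth:", depth[H // 2, W // 2], "m")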

Transcription mode

Select Transcription from the mode picker to turn your iPhone into a wireless live speech-to-text source for creative tools. The camera view is replaced with a black canvas and a 200-bar mirrored waveform driven by the microphone. Recognized words appear on screen as live captions. Uses iOS 26's on-device SpeechAnalyzer framework — fast, private, offline.

First-time setup

Microphone and Speech Recognition permissions are requested upfront on first launch (along with Camera and Location), and the on-device speech model for your language is prefetched in the background. By the time you open this mode for the first time, transcription is usually ready to go — no model-download wait. If you denied either permission, the mode displays a message with a shortcut to iOS Settings to grant access. If the prefetch didn't finish or failed, the mode falls back to downloading on first use.

While active

  • Listening indicator — A pulsing microphone icon and "LISTENING" label appear in the status bar
  • Waveform — 200 mirrored bars pulse with your voice in real time and stream through NDI as a black-and-white video source
  • Live captions — Recognized words appear on screen as large centered text as they arrive
  • Camera tracking OSC suppressed — Camera position messages are automatically paused so speech data stands out in your OSC receiver
  • Clean exit — Leaving transcription mode stops the audio engine completely. No background microphone access.

OSC addresses

When OSC streaming is enabled, transcription sends these addresses:

  • /lota/speech/word — each recognized word (string)
  • /lota/speech/word_count — incrementing counter (int, visible in OSC In CHOP)
  • /lota/speech/partial — running partial transcript (string)
  • /lota/speech/final — finalized sentences (string)

String messages arrive in TouchDesigner's OSC In DAT (not OSC In CHOP — CHOP only handles numeric channels). The word_count integer is the only speech message visible in OSC In CHOP, useful for signal-flow monitoring.

TCP / UDP binary wire format

For custom integrations, LOTA also sends recognized speech as binary frames over TCP and UDP: FrameType.speechText = 4, a 24-byte FrameHeader + UTF-8 payload. The header's width field is repurposed for speech kind (0 = word, 1 = partial, 2 = final), height for word index. Use the drag-and-drop TD components below to parse this format automatically.
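
For custom receivers, a hedged Python sketch of a UDP speech-frame parser. The 24-byte FrameHeader's exact field layout isn't documented here, so the offsets below (little-endian uint32 type, width, and height at the front) are assumptions; check them against the LOTASpeechUDP.tox callback script, which parses the real format.

    import socket
    import struct

    KIND = {0: "word", 1: "partial", 2: "final"}

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", 9847))      # LOTA's default TCP/UDP port

    while True:
        data, _ = sock.recvfrom(65535)
        # Assumed header layout: type, width, height as the first 12 bytes.
        ftype, width, height = struct.unpack_from("<III", data, 0)
        if ftype != 4:                # FrameType.speechText
            continue
        text = data[24:].decode("utf-8", errors="replace")
        print(f"{KIND.get(width, '?')} #{height}: {text}")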

Which transport should you use?

  • OSC — Easiest for TouchDesigner, Max/MSP, Resolume. Drop in an OSC In DAT, set the port, done.
  • TCP / UDP — Use the LOTASpeechTCP / LOTASpeechUDP components (downloads in the TouchDesigner section below) for plug-and-play binary parsing, or integrate into custom apps, game engines, and scripts that don't have OSC parsers.
  • NDI — When you want the waveform visual as a live video source for reactive effects or broadcast overlays.

Motion mode

Select Motion from the mode picker to turn the iPhone into a wireless motion-sensor OSC source. LOTA reads the device motion sensors and streams them as OSC. Each active sensor value also renders as its own scrolling line graph lane on screen so operators can see the data in real time — the full graph composition is captured by NDI. Works on every iPhone — no LiDAR required.

Sensors (individually toggleable in Settings)

  • Acceleration (ON by default) — Gravity-removed, G-force units. Addresses /lota/motion/accel/x, /y, /z
  • Gyroscope — Rotation rate in rad/s. Addresses /lota/motion/gyro/x, /y, /z
  • Compass Heading — Degrees 0–360 (requires Location permission). Address /lota/motion/heading
  • Barometric Pressure — kPa plus relative altitude in meters from session start. Addresses /lota/motion/pressure and /lota/motion/altitude

Update rate picker: 30 / 60 / 100 Hz. 30 Hz matches ARKit. 100 Hz is useful for latency-critical controllers.
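
A minimal Python sketch of a motion receiver, assuming the python-osc package (any OSC library works). A default handler catches every per-axis address without enumerating them.

    # pip install python-osc
    from pythonosc.dispatcher import Dispatcher
    from pythonosc.osc_server import BlockingOSCUDPServer

    def on_message(address, *values):
        if address.startswith("/lota/motion/"):
            print(address, values)    # e.g. /lota/motion/accel/x (-0.012,)

    dispatcher = Dispatcher()
    dispatcher.set_default_handler(on_message)
    BlockingOSCUDPServer(("0.0.0.0", 9000), dispatcher).serve_forever()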

Permissions

On first entry to Motion mode, iOS prompts for Motion & Fitness access. If Compass is toggled on, an additional Location-When-In-Use prompt appears. CoreMotion data uses NSMotionUsageDescription; heading uses NSLocationWhenInUseUsageDescription. If denied, the relevant sensors silently skip (compass stays at −1, etc.) — no crash. Permissions are requested upfront when entering the mode, not mid-session.

Scrolling graph aesthetic

Each active sensor value gets its own horizontally stacked lane with a dim center line, a separator line, and a colored 2-character glyph label (AX, AY, AZ, GX, GY, GZ, HD, PR, AL) that follows the line's current Y position. Between sample arrivals, the graph slides left by a fractional sample-width (sub-sample scroll) so the visual feels continuous even at 30 Hz data rates. Rendered in Metal and captured by NDI.

Use cases

  • Phone as a wireless tilt / shake / toss controller for reactive visuals in TouchDesigner, Resolume, or Max/MSP
  • Environmental sensing — barometric pressure drift and altitude tracking for installations
  • Wearable motion source — phone clipped or strapped to a performer streams tilt, rotation rate, and compass heading as OSC
  • Camera pose OSC is automatically suppressed in Motion mode so sensor channels stand out cleanly in an OSC In CHOP

Audio mode

Select Audio from the mode picker to turn the iPhone into a wireless real-time audio analysis source. LOTA taps the microphone via AVAudioEngine and extracts musically relevant features using Apple's Accelerate/vDSP framework — zero third-party dependencies, fully on-device. Each active channel renders as a scrolling graph lane and is captured by NDI. Works on every iPhone — no LiDAR required.

Four channel groups (individually toggleable)

  • Levels (ON by default) — Continuous 0–1 energy per frequency band with rolling auto-gain. /lota/audio/bass, /lota/audio/mid, /lota/audio/high
  • Beat Detection — Binary 0/1 switch per band on detected onset (holds at 1.0 for 50 ms after each hit). /lota/audio/drums/low, /lota/audio/drums/mid, /lota/audio/drums/high
  • Dynamics — Fast-vs-slow envelope difference, pulse-shaped 0–1 (rises instantly on transient, decays over ~200 ms). /lota/audio/burst
  • FFT Spectrum — 20 log-spaced frequency bands across 20–20,000 Hz, each normalized 0–1 via per-bin rolling max. /lota/audio/fft/0 through /lota/audio/fft/19

Update rate: 30 / 60 Hz. FFT and analysis run internally at ~86 Hz and decimate to the user's output rate.
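
As an illustration of the band layout (not LOTA's internal code), the 20 log-spaced band edges can be reproduced in a few lines of Python:

    import numpy as np

    # 21 edges spanning 20 Hz to 20 kHz, log-spaced: ratio = 1000 ** (1/20)
    edges = 20.0 * (20000.0 / 20.0) ** (np.arange(21) / 20.0)
    for k in range(20):
        print(f"/lota/audio/fft/{k}: {edges[k]:8.1f} - {edges[k + 1]:8.1f} Hz")
    # fft/0 covers roughly 20-28 Hz; fft/19 covers roughly 14.1k-20k Hz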

Frequency bands

  • BASS (20–200 Hz) — kick, sub-bass, bass guitar fundamentals
  • MID (200–2000 Hz) — vocals, snare body, guitars, most instruments
  • HIGH (2000–8000 Hz) — snare attack, hi-hats, vocal consonants, cymbals

Beat detection algorithm

Multi-stage onset detection designed to be accurate on dense, dynamic music without false-triggering on silence:

  • 1. Log-magnitude spectral flux per band (more perceptually aligned than raw flux)
  • 2. One-pole lowpass smoothing (~15 Hz cutoff)
  • 3. Peak-picking on local maxima — catches rapid drum fills that threshold-only methods miss
  • 4. Adaptive threshold noise_floor + k·stddev where k is inversely modulated by coefficient of variation (dense music gets lower k, steady signals get higher k)
  • 5. 20th-percentile noise-floor tracking over ~3 s (robust to sustained beats)
  • 6. Silence energy gate per band — no beats when the band itself is quiet
  • 7. 50 ms minimum inter-onset time — supports up to ~1200 BPM / 32nd notes

Drum channels are binary (0 or 1) — trivial to route into a TouchDesigner Trigger CHOP or Logic CHOP without thresholding.
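
A simplified Python sketch of stages 4 through 7, taking the smoothed per-band spectral flux of stages 1 and 2 as input. It is illustrative only: the constants are guesses, and LOTA's real detector adds the peak-picking stage.

    import numpy as np
    from collections import deque

    class OnsetGate:
        """Per-band onset gate: adaptive threshold, silence gate, refractory time."""

        def __init__(self, history=256, min_interval=0.05):
            self.flux = deque(maxlen=history)   # ~3 s of flux at ~86 Hz
            self.min_interval = min_interval    # 50 ms minimum inter-onset time
            self.last_onset = -1.0

        def update(self, flux, band_energy, t):
            self.flux.append(flux)
            if len(self.flux) < 32 or band_energy < 1e-4:   # silence energy gate
                return 0
            h = np.asarray(self.flux)
            noise_floor = np.percentile(h, 20)              # 20th-percentile floor
            mean, std = h.mean(), h.std()
            cv = std / (mean + 1e-9)                        # coefficient of variation
            k = float(np.clip(2.5 / (cv + 0.5), 1.0, 4.0))  # dense music gets lower k
            fire = flux > noise_floor + k * std and t - self.last_onset >= self.min_interval
            if fire:
                self.last_onset = t
            return int(fire)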

Permissions & limitations

On first entry to Audio mode, iOS prompts for Microphone access. If denied, a clear overlay with an "Open Settings" deep-link appears and no audio engine starts.

System audio limitation: LOTA analyzes the microphone input only. It cannot capture system audio (Spotify, Apple Music, YouTube) because iOS sandboxing doesn't expose system audio to third-party apps. To analyze music playing on the iPhone, play it through the speakers — LOTA's mic picks it up naturally. For high-fidelity analysis, play audio from a separate source into the iPhone's mic.

TouchDesigner pattern

Drop an OSC In CHOP, point it at the phone's IP + port 9000. Enable Levels (default). Channels bass, mid, high appear and start pulsing. Toggle Beat Detection on → drums_low, drums_mid, drums_high appear as clean 0/1 signals you can route into a Trigger CHOP for beat-synced effects. FFT bands render with a red → blue color gradient (F0 = lowest, F19 = highest) so you can see where energy is on the spectrum at a glance.

Network

Streaming

Tap the transmit button (bottom right) to start streaming. All enabled transports send simultaneously. Configure transports by tapping the floating status bar pill at the top of the screen — this opens Transmission Settings, a focused sheet with Receiver IP, TCP/UDP, NDI, OSC, Point Cloud Stream, and Protocol Info sections. While streaming, the per-protocol chip in the status bar (OSC, NDI, TCP/UDP, PLY) goes from dim white to red, and a small dim text block at the bottom of the screen lists each active transmission as <protocol> <host>:<port> so you can confirm settings at a glance.

NDI

Industry-standard video-over-IP. LOTA appears as LOTA (iPhone) on your network and is auto-discovered by TouchDesigner, OBS, vMix, Resolume, and any other NDI-compatible receiver. No IP configuration needed. Optional side-by-side mode sends a 2x-wide frame with camera view on the left and depth colormap on the right.

TCP / UDP

Streams H.264-encoded video for Color and Mono modes, and raw Float32 depth maps for Depth and Point Cloud modes. TCP reconnects automatically; UDP is fire-and-forget. Set the destination IP and port in Settings.

OSC

Streams real-time camera tracking data at ~30 Hz over UDP. Set any OSC-capable tool to listen on LOTA's configured OSC port to receive:

  • /lota/camera/position — x, y, z (3 floats)
  • /lota/camera/rotation — quaternion x, y, z, w (4 floats)
  • /lota/camera/euler — pitch, yaw, roll (3 floats)
  • /lota/mode — current capture mode (1 Hz)
  • /lota/fps — current frame rate (1 Hz)
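
A minimal Python receiver for the pose stream, assuming the python-osc package (any OSC library works) and the default OSC port:

    # pip install python-osc
    from pythonosc.dispatcher import Dispatcher
    from pythonosc.osc_server import BlockingOSCUDPServer

    def on_position(address, x, y, z):
        print(f"camera at ({x:.3f}, {y:.3f}, {z:.3f})")

    def on_rotation(address, qx, qy, qz, qw):
        print(f"rotation ({qx:.3f}, {qy:.3f}, {qz:.3f}, {qw:.3f})")

    dispatcher = Dispatcher()
    dispatcher.map("/lota/camera/position", on_position)
    dispatcher.map("/lota/camera/rotation", on_rotation)
    BlockingOSCUDPServer(("0.0.0.0", 9000), dispatcher).serve_forever()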

PLY (live point cloud)

Sends live point cloud frames over TCP as packed binary (15 bytes/point: 3 floats + 3 uint8 RGB) on port 9848 by default. Pair it with the LOTABinaryPLYRecieverV2.tox drop-in component (TouchDesigner section below) — it auto-detects the binary header and only asks for the port and a Script TOP target. A legacy CSV text mode remains available for custom receivers but is deprecated and will be removed in a future update.
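
If you'd rather parse the binary stream yourself, here is a hedged Python sketch of the per-point unpacking (3 × float32 plus 3 × uint8 is 15 bytes per point). The per-frame header that precedes the points is auto-detected by the .tox and isn't documented here, so this assumes you have already located the point payload.

    import numpy as np

    point_dtype = np.dtype([("xyz", "<f4", (3,)), ("rgb", "u1", (3,))])
    assert point_dtype.itemsize == 15   # packed, matching the wire format

    def parse_points(payload: bytes) -> np.ndarray:
        n = len(payload) // 15
        return np.frombuffer(payload[: n * 15], dtype=point_dtype)

    # parse_points(...)["xyz"] -> (N, 3) float32 positions in meters
    # parse_points(...)["rgb"] -> (N, 3) uint8 colors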

How to start streaming

1

Open Transmission Settings (tap the status bar pill)

Tap the floating status bar pill at the top of the screen. Enable the protocols you need and set the destination IP (shared across all transports) and port for each. NDI requires no configuration.

2

Connect to the same Wi-Fi network

LOTA and your receiving machine must be on the same local network. A 5 GHz network is recommended for lowest latency.

3

Tap the stream button (bottom right)

All enabled protocols start simultaneously. The status bar shows a live indicator for each active protocol.

Motion Capture

ARKit Tracking

Swipe left from the main camera page to access ARKit motion capture. Three tracking modes stream data over OSC in real time.

Body Tracking

3D skeleton detection via the rear camera. Tracks 91 ARKit joints, streams 18 key joints over OSC. Visual overlay shows white bones with green joint dots. Requires A12 chip or later.

  • /lota/body/skeleton — 18 joint positions
  • /lota/body/root — root transform
  • /lota/body/detected — detection state

Face Tracking

52 facial blend shapes captured via the front-facing TrueDepth camera. Overlay shows a bar graph of the top 8 most active blend shapes. Requires TrueDepth camera (iPhone X or later).

Each blend shape is sent as its own named OSC address (e.g. /lota/face/browDown_L, /lota/face/eyeSquint_R) bundled in a single UDP datagram.

Hand Tracking

Detects up to 2 hands simultaneously via the rear camera using the Vision framework. 21 landmarks per hand, streamed over OSC organized by finger. Overlay shows bone chains with joint dots — teal for left hand, orange for right.

  • /lota/hand/{left|right}/wrist — 3 floats
  • /lota/hand/{left|right}/thumb — 12 floats
  • /lota/hand/{left|right}/index — 12 floats
  • /lota/hand/{left|right}/middle — 12 floats
  • /lota/hand/{left|right}/ring — 12 floats
  • /lota/hand/{left|right}/pinky — 12 floats
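
Each finger address carries 12 floats, which read naturally as 4 joints × (x, y, z). A Python sketch assuming the python-osc package, with the knuckle-to-tip joint ordering also an assumption:

    # pip install python-osc
    import numpy as np
    from pythonosc.dispatcher import Dispatcher
    from pythonosc.osc_server import BlockingOSCUDPServer

    def on_finger(address, *values):
        joints = np.array(values).reshape(4, 3)   # assumed knuckle -> tip order
        print(address, "tip at", joints[-1])

    dispatcher = Dispatcher()
    for hand in ("left", "right"):
        for finger in ("thumb", "index", "middle", "ring", "pinky"):
            dispatcher.map(f"/lota/hand/{hand}/{finger}", on_finger)
    BlockingOSCUDPServer(("0.0.0.0", 9000), dispatcher).serve_forever()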

Hand coordinate modes

2D (default) — normalized screen-space coordinates. Works on all iPhones. 3D (opt-in) — world-space coordinates projected via LiDAR depth. Requires a LiDAR device. Toggle in Hands Settings → 3D Hand Coordinates (tap the Hands Settings button below the status bar on the ARKit Tracking page).

Integration

TouchDesigner

LOTA is built with TouchDesigner workflows in mind. Stream live depth, color, and point cloud data directly into your TD project with no capture cards or expensive rigs.

LOTABinaryPLYRecieverV2.tox — Binary Point Cloud Receiver

Drop-in TouchDesigner receiver for LOTA's binary PLY stream. Auto-detects the binary header — only asks for the receiver port and a Script TOP target. Uses numpy bulk parsing and GPU instancing to handle 49K+ points at 60 fps. Enable Binary Format in Transmission Settings → Point Cloud Stream (tap the floating status bar pill) and point this component at the same port.

Download .tox

LOTASpeechTCP.tox — Speech TCP Receiver

TCP/IP DAT with a callback script that parses LOTA's binary speech frame format and writes recognized words, partials, and finals to a speech_log Table DAT. Handles stream buffering for frames split across TCP packets. Drop into your network and recognized speech starts populating rows — no manual parsing required.

Download .tox

LOTASpeechUDP.tox — Speech UDP Receiver

UDP In DAT configured for "One Per Message" mode with a callback script that parses the same binary speech frame format as the TCP receiver. Simpler than TCP — each datagram is a complete frame, so no buffering is needed. Use this for low-latency fire-and-forget speech delivery on a local network.

Download .tox

NDI input (easiest)

Drop an NDI In TOP into your project. LOTA appears in the source list automatically. Select it, and you're receiving live video.

Point cloud via LOTABinaryPLYRecieverV2.tox

  1. Drop LOTABinaryPLYRecieverV2.tox into your TouchDesigner network.
  2. In LOTA, enable PLY Streaming and Binary Format in Transmission Settings → Point Cloud Stream and note the port.
  3. On the component, set the receiver port to match and wire its output Script TOP into your render network — live 3D geometry is ready to instance.

Camera tracking via OSC

Add an OSC In CHOP. Set the port to match LOTA's OSC port. You'll receive camera position (x, y, z), rotation (quaternion), and Euler angles at 30 Hz — perfect for driving a virtual camera or triggering effects based on device movement.

3D Export

Export & 3D Pipelines

Swipe right to the Gaussian Capture page. Record datasets with ARKit intrinsics, extrinsics, and LiDAR point clouds — ready for training, viewing, or post-production.

COLMAP

Exports cameras.bin, images.bin, and points3D.bin in COLMAP binary format. Compatible with OpenSplat, gsplat, and Nerfstudio for training Gaussian Splats directly from your iPhone captures.

Nerfstudio

Exports transforms.json + images/ JPEGs + points3D.ply. Ready for Nerfstudio, splatfacto, and Instant-NGP training pipelines.

Nerfstudio + Depth

Same as Nerfstudio, plus 16-bit PNG depth maps in a depth/ folder. Best for depth-supervised training — produces better geometry on flat surfaces and uniform areas.

Point Cloud (PLY)

Standalone points3D.ply file with accumulated points from your session. Open in Blender, CloudCompare, MeshLab, or any tool that reads PLY.

Material (PBR)

Single-shot capture of a flat surface as a complete PBR material set. ZIP contains basecolor.png (sRGB), normal.png, height.png (16-bit), ao.png, roughness.png, preview.png, and a material.json manifest with patch size and tiling hints. Drop into Substance Designer/Painter, Blender, Unreal, Unity, or TouchDesigner. Requires LiDAR.

What happens during recording

  • Mesh overlay — A semi-transparent wireframe shows scanned surfaces building up in real time (cyan near, purple far)
  • Keyframe selection — Only frames where the camera moved at least 5 cm or rotated ~5° are saved, producing a well-distributed set of training views (see the sketch after this list)
  • Blur detection — Motion-blurred frames are automatically rejected via Laplacian variance analysis
  • Focus lock — Autofocus is disabled during recording to keep camera intrinsics consistent across all frames
  • Haptic feedback — A subtle tap each time a keyframe is captured
  • Counters — Elapsed time, keyframe count, and total point count shown on screen
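
An illustrative Python reimplementation of the keyframe gate and blur test (not LOTA's code; the Laplacian threshold and the scalar rotation compare are simplifications):

    import cv2
    import numpy as np

    def is_sharp(gray, threshold=100.0):
        # Laplacian-variance blur metric: blurred frames have little
        # high-frequency energy, hence low variance.
        return cv2.Laplacian(gray, cv2.CV_64F).var() > threshold

    def is_keyframe(pos, yaw_deg, last_pos, last_yaw_deg):
        moved = np.linalg.norm(pos - last_pos) >= 0.05   # at least 5 cm
        rotated = abs(yaw_deg - last_yaw_deg) >= 5.0     # roughly 5 degrees
        return moved or rotated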

When you stop recording

  • 1. Dataset files are written (metadata, point cloud, images)
  • 2. Everything is compressed into a single .zip file
  • 3. A summary shows format, keyframes, points, file size, and filename
  • 4. The zip syncs to iCloud automatically if you chose an iCloud folder

Tips for best results

1

Move slowly

Walk around the subject at a steady pace. A typical 30-second scan produces ~80 keyframes and a 10–20 MB zip.

2

Capture from multiple angles

Shoot from low, mid, and high heights. Ensure adjacent viewpoints have 70–80% overlap for best reconstruction.

3

Use good lighting

Consistent, well-lit environments produce better results. Avoid reflective and transparent surfaces — LiDAR struggles with glass and mirrors.

Material Capture

Select Material from the format picker to swap the page from a recording flow to a plane-lock + shutter flow. One tap, ~½-second bake on iPhone 15/16 Pro at 1024², ZIP saved to your iCloud folder. Requires LiDAR.

When to use Material vs the other formats

  • 3D scan of an object or room you can move around — use COLMAP, Nerfstudio, or Point Cloud
  • Tileable PBR texture of a flat surface (floor, wall, table, fabric, brick…) — use Material
1

Pick Material from the format picker

The page swaps to the material-capture UI: status icon, plane-status chip, Lock Plane action, and shutter button.

2

Pick an export folder

Tap the folder icon (bottom left) to choose an iCloud Files folder if you haven't already.

3

(Optional) Open Material Settings

Adjust output resolution (512 / 1024 / 2048), AO sample count (32 / 64 / 128), normal convention (OpenGL or DirectX), delight strength, and roughness scale. All persist via UserDefaults.

4

Point at a flat surface and tap Lock Plane

ARKit raycast detects horizontal and vertical planes. Lock Plane snaps a 20 cm screen-aligned square patch centered on what you're pointing at — output "top" = away from camera, "right" = camera's right.

5

Tap the shutter

The torch fires a brief flash-pair sequence (~200 ms), autofocus locks for both grabs so they share the same focal distance, then the bake runs (~220 ms at 1024² with 64 AO samples on iPhone 15/16 Pro).

6

Save to Files

The Material Save Summary sheet shows a live PBR sphere preview, file-size estimate, and a metallic toggle. Tap Save — ZIP is written to your iCloud folder, the sheet auto-dismisses, and the plane lock stays active for the next shot.

What's in the ZIP

  • basecolor.png — sRGB 8-bit albedo, white-balanced and de-lit via flash-pair specular subtraction
  • normal.png — linear 8-bit tangent-space normal from the LiDAR depth gradient (OpenGL +Y up by default; DirectX selectable in Material Settings)
  • height.png — linear 16-bit, plane-relative ±25 mm range mapped to UInt16
  • ao.png — linear 8-bit horizon-based ambient occlusion baked from the height map
  • roughness.png — linear 8-bit per-texel estimate from flash-pair specular sharpness, multiplied by your roughness scale slider
  • preview.png — sRGB 8-bit Cook-Torrance BRDF sphere render of the captured material
  • material.json — manifest with planeMeters (physical patch size), tilingHintFor1mSquare, normal convention, roughness method, metallic value, device hardware identifier, LiDAR generation, capture date, and gyro drift between flash pair
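
To recover metric relief from height.png, invert that mapping. This assumes the ±25 mm range is mapped linearly onto UInt16, which is the natural reading of the spec above.

    import cv2

    h16 = cv2.imread("height.png", cv2.IMREAD_UNCHANGED)   # uint16, as exported
    height_m = (h16.astype("float32") / 65535.0) * 0.05 - 0.025
    print("relief:", height_m.min() * 1000, "to", height_m.max() * 1000, "mm")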

Receiver workflows

  • Substance Designer — drop the unzipped folder onto a new graph; Substance auto-creates a Material node with all six channels wired
  • Substance Painter — drag the folder into Shelf → Materials, then apply to a mesh
  • Blender — in the Shading editor, add an Image Texture per file and wire to a Principled BSDF; set non-color spaces on normal, roughness, ao, and height
  • Unreal Engine — drop the folder into the Content Browser → Unreal auto-creates a Material Instance. Switch normal convention to DirectX in Material Settings before capture so the green channel reads correctly
  • Unity — standard URP / HDRP Lit shader inputs map 1:1
  • TouchDesigner — load each PNG into a Movie File In TOP, wire to a PBR MAT

Tips for best results

  • Hold the phone still during the shutter tap — the flash-pair takes ~200 ms; the manifest's gyroDriftRadians field surfaces how much drift the bake saw (below 0.5° / ~0.009 rad is fine)
  • Capture distance — 20–60 cm from the surface gives the best detail-to-coverage ratio at the default 20 cm patch size
  • Lighting — ambient + the iPhone torch. Too-bright ambient washes out the controlled flash signal that drives the roughness estimate
  • Best surfaces — wood, fabric, leather, brick, concrete, plaster, painted drywall, asphalt, ceramic tile (matte), unpolished plastic, paper, carpet, bark, stone — anything opaque, dielectric, and roughly diffuse

Limitations & v1.2 caveats

Surfaces where the pipeline degrades or fails: glass (LiDAR passes through), mirrors and chrome (captures the reflection, not the surface), polished metal (basecolor contaminated by reflections), skin / wax / translucent plastic (subsurface scattering not modeled), velvet and fur (anisotropic / sheen BRDFs not modeled), wet surfaces (captures the wet film).

v1.2 caveats (planned for v1.3+):

  • Metallic is a uniform toggle (manifest only) — no per-texel metallic.png. v1.3 adds CoreML SVBRDF inference
  • Roughness is a heuristic (flashPairSpecularSharpness) — adjust the roughness scale slider to bias the result
  • Framing rect is auto-installed at 20 cm — drag handles to resize / reposition land in v1.3
  • Single-shot only — multi-view capture for cleaner albedo planned for v1.3

Configuration

Settings Reference

Settings are split into focused sheets reachable from two places: Transmission Settings (tap the floating status bar pill at the top of the Camera / Streaming or ARKit Tracking page — Receiver, Transport, NDI, OSC, Point Cloud Stream, Protocol Info, and a Help section with Replay Tutorial) and mode-specific settings (tap the <Mode> Settings button just below the status bar — only the section relevant to the active mode). Color and Mono have no per-mode settings, so no button is shown. The receiver IP is shared across all transports. The reference below documents every setting; the headings indicate which sheet hosts each section.

Receiver — Transmission Settings

Receiver IP: 192.168.1.100

IP address of the computer receiving streams. Shared across all transports (TCP, UDP, OSC, PLY).

Transport (TCP/UDP) — Transmission Settings

TCP/UDP Output: On

Enable or disable the video transport.

Protocol: TCP

TCP (reliable, auto-reconnects) or UDP (fire-and-forget). TCP sends H.264 video for Color/Mono, raw Float32 depth for Depth/Point Cloud modes.

Port: 9847

Destination port for TCP/UDP streams.

NDI — Transmission Settings

NDI Video Output: Off

Enable or disable NDI streaming. Auto-broadcasts as "LOTA (iPhone)" on the local network.

Side-by-Side: Off

Sends a 2x-wide frame: left half is the camera view, right half is the depth colormap. Standard format for TouchDesigner and Notch workflows.

OSC — Transmission Settings

OSC Output: Off

Enable or disable OSC messages.

Port: 9000

OSC destination port.

Depth — Depth Settings sheet

Color Map: Visible Spectrum

Colormap for depth visualization. Options: Black & White, Black Aqua White, Blue Red, Deep Sea, Color Spectrum, Incandescent, Heated Metal, Sunrise, Visible Spectrum.

Point Cloud — Point Cloud Settings sheet

Frame Window: 30 (range 5–60)

Number of accumulated LiDAR frames in the live point cloud sliding window. Higher values show more spatial coverage but use more GPU memory. Only affects the live view, not Gaussian Capture.

Max Depth: 5.0 m (range 1–10 m)

Maximum LiDAR range. Points beyond this are discarded. Gen 1 LiDAR (iPhone 12–14 Pro) is reliable to ~5m. Gen 2 (iPhone 15–16 Pro) can reach ~10m. Affects both live view and Gaussian Capture exports.

Compute Quality: Balanced

GPU compute frame skip for thermal management. Full = every frame, Balanced = every 2nd, Efficient = every 3rd. Only affects live Point Cloud mode.

Min Confidence: Medium+

LiDAR depth confidence filter. All = no filtering, maximum density. Medium+ = removes low-confidence noisy edge pixels. High Only = fewest points, highest accuracy. Affects both live view and Gaussian Capture exports.

Blob Tracking — Detection (Blob Track Settings sheet)

Base Style: Color

What the underlying camera layer looks like behind the rectangle outlines. Color (live RGB), Mono (grayscale), Mask (grayscale subject silhouette on black), or Binary (white-on-black silhouette — authentic TouchDesigner Blob Track TOP look).

Draw Blob Bounds: On

Draw 1-pixel hairline rectangle outlines around detected blobs. Turn off to see only the base layer.

Show ID Labels: Off

Draw #id text labels just outside the top-right corner of each blob bounding box.

Blob Color: Pure green

RGB color used for both rectangle outlines and ID labels.

Blob Tracking — Depth (Blob Track Settings sheet)

Min Depth: 0.5 m (range 0.1–5.0 m)

Near edge of the depth slab. Pixels closer than this are excluded from blob detection.

Max Depth: 3.0 m (range 0.5–10.0 m)

Far edge of the depth slab. Pixels farther than this are excluded from blob detection.

Min Confidence: Medium+

LiDAR depth confidence filter (same semantics as Point Cloud). All / Medium+ / High Only. Pixels below the threshold are excluded from detection.

Blob Tracking — Constraints (Blob Track Settings sheet)

Min Blob Size: 50 px (range 10–500)

Minimum pixel area to count as a blob. Filters out single noise pixels and tiny artifacts.

Max Blob Size: 30000 px (range 100–49152)

Maximum pixel area. Rejects the "everything is one blob" failure mode where the entire scene merges together.

Max Move Distance: 0.15 (range 0.01–0.5)

Maximum normalized distance a blob can travel between frames and keep its ID. Increase for fast-moving subjects or higher-FPS scenes.

Blob Tracking — Revival (Blob Track Settings sheet)

Revive Blobs: On

Re-identify lost blobs with the same ID if they reappear inside the revival window. Mirrors TouchDesigner's blob lifecycle exactly.

Revive Time: 0.5 s

Seconds a lost blob is kept alive for revival before it transitions to expired and is permanently dropped.

Revive Distance: 0.2

Maximum normalized distance for the revival match. A lost blob must reappear within this radius to recover its ID.

Include Lost in OSC: Off

Emit lost blobs in the OSC bundle with state = 2.

Include Expired in OSC: Off

Emit expired blobs in the OSC bundle with state = 3.

Point Cloud Stream (PLY) — Transmission Settings

PLY Streaming: Off

Enable or disable live point cloud TCP stream.

Port: 9848

PLY stream destination port.

Binary Format: Off (recommended: On)

Packed binary (15 bytes/point) vs CSV text. Binary is ~40% smaller, parses ~3× faster on the receiver, and is the format the supported LOTABinaryPLYRecieverV2.tox drop-in component targets — turn it on for every new capture. CSV text mode remains for backwards compatibility with custom receivers but is deprecated and will be removed in a future update.

Tracking — Body / Face / Hands Settings sheets

Skeleton Overlay: On

Show or hide body skeleton visualization on the tracking page.

Face Overlay: On

Show or hide face blend shape bar graph on the tracking page.

Hand Overlay: On

Show or hide hand bone and joint visualization on the tracking page.

3D Hand Coordinates: Off

Use LiDAR-projected world-space hand coordinates instead of normalized screen-space. Requires a LiDAR-equipped device.

Transcription — Transcription Settings sheet

Send Per-Word OSC: On

Send each recognized word as it arrives (/lota/speech/word + TCP/UDP word frames). The /lota/speech/word_count integer always fires alongside per-word messages as a numeric CHOP-visible signal.

Send Partial Transcript: On

Send running partial transcript updates as words arrive (/lota/speech/partial + TCP/UDP partial frames).

Send Final Transcript: On

Send finalized sentences when the recognizer commits (/lota/speech/final + TCP/UDP final frames).

These toggles apply to both OSC and TCP/UDP transports simultaneously. Turn off anything you don't need to reduce noise in your receiver.

Device Motion — Motion Settings sheet

Acceleration (X, Y, Z): On

Gravity-removed acceleration in G-force, sent as three OSC floats per update (/lota/motion/accel/x, /y, /z).

Gyroscope (X, Y, Z): Off

Rotation rate in rad/s, three OSC floats per update (/lota/motion/gyro/x, /y, /z).

Compass Heading: Off

Magnetic heading in degrees 0–360 (requires Location permission). Address /lota/motion/heading.

Barometric Pressure: Off

Atmospheric pressure in kPa plus relative altitude in meters (/lota/motion/pressure, /lota/motion/altitude).

Update Rate: 30 Hz

30 / 60 / 100 Hz picker. 30 Hz matches ARKit. 100 Hz is useful for latency-critical controllers.

Each enabled sensor draws a scrolling graph lane on screen and streams over OSC. Toggling a sensor off mid-session removes its graph lane and stops its OSC channel within ~50 ms (debounced to prevent crash loops from rapid toggles). Permissions are requested upfront when entering Motion mode.

Audio Analysis — Audio Settings sheet

Levels (Bass / Mid / High): On

Continuous 0–1 energy per frequency band with rolling auto-gain. /lota/audio/bass, /lota/audio/mid, /lota/audio/high.

Beat Detection: Off

Binary 0/1 switch per band on detected onset (50 ms gate window). /lota/audio/drums/low, /lota/audio/drums/mid, /lota/audio/drums/high.

Dynamics: Off

/lota/audio/burst — fast transient detector, pulse-shaped 0–1 (rises instantly, decays over ~200 ms).

FFT Spectrum (20 bands): Off

20 log-spaced FFT bands across 20–20,000 Hz, each normalized 0–1. /lota/audio/fft/0 through /lota/audio/fft/19, rendered with a red → blue color gradient.

Update Rate: 30 Hz

30 / 60 Hz picker. Analysis runs internally at ~86 Hz and decimates to the chosen output rate.

Microphone permission is requested upfront when entering Audio mode. Uses the same NSMicrophoneUsageDescription as Transcription mode. LOTA can only analyze the microphone input — iOS sandboxing prevents capturing system audio from other apps.

Inclusive Design

Accessibility

LOTA is designed to meet Apple's App Store accessibility standards. Every feature works for every user.

VoiceOver

Every control is labeled and announced. Mode switches, streaming state, and recording status are all spoken aloud.

Voice Control

Say stream, record, or switch to Depth and LOTA responds. Every button is discoverable by voice.

Dynamic Type

Text scales up to 200%+. The UI reflows to a single-column layout at extreme sizes so nothing gets cut off.

Dark Interface

Designed dark from the start. Every screen, menu, and control uses a true dark color scheme for comfortable use in any environment.

Differentiate Without Color

Status indicators swap to distinct symbols when this setting is enabled. Shapes, icons, and text labels replace color as the sole differentiator.

Increase Contrast

Swaps blur materials for solid backgrounds and boosts status colors for guaranteed readability over any camera feed.

Reduce Motion

All transitions respect the system Reduce Motion setting. Visual feedback stays, decorative animation goes.

Support

Frequently Asked Questions

Which iPhones support LOTA?

Any iPhone running iOS 26.2 or later. LiDAR features (Depth, Point Cloud, Blob Track, Gaussian Capture, Material Capture, 3D hand coordinates) require iPhone 12 Pro or later. Color, Mono, Transcription, Motion, and Audio modes, NDI streaming, and TCP/UDP streaming all work on iPhones without LiDAR. Face tracking requires TrueDepth camera (iPhone X or later). Body tracking requires A12 chip or later. iPad Pro models with LiDAR are also supported.

Does Transcription mode need internet access?

No. Transcription uses iOS 26's on-device SpeechAnalyzer framework and runs entirely offline. After you grant Speech Recognition permission on first launch, LOTA prefetches the on-device speech model for your locale in the background — by the time you open Transcription mode the model is on disk, so first captions arrive without a download wait. If the prefetch didn't finish, the model installs lazily on first use as a fallback. Audio never leaves your device.

Do the receiving machine and iPhone need to be on the same network?

Yes. For TCP, UDP, OSC, and PLY streaming, both devices must be on the same local network. NDI also uses the local network but handles discovery automatically. A 5 GHz Wi-Fi connection is recommended.

What software can receive LOTA streams?

Any NDI-compatible software (TouchDesigner, OBS, vMix, Resolume), any tool that reads TCP/UDP sockets, and any OSC-capable application (Max/MSP, Ableton, Unreal Engine).

How do I use the export data for Gaussian Splat training?

Export a COLMAP or Nerfstudio dataset from the Gaussian Capture page, transfer the zip to your training machine, and point OpenSplat, Nerfstudio, or gsplat at the extracted folder. The files are in the exact format these tools expect.

Is there a latency cost to streaming?

NDI and TCP/UDP streams typically add 1–3 frames of latency depending on your network. OSC tracking data arrives at 30 Hz with sub-frame latency. A wired connection or 5 GHz Wi-Fi keeps things as fast as possible.

What export format should I use?

COLMAP for OpenSplat and gsplat. Nerfstudio for splatfacto and Instant-NGP. Nerfstudio + Depth for best geometry on flat or featureless surfaces. Point Cloud (PLY) for quick visualization in Blender or CloudCompare.
