Commit 31b4428 — "update docs" (parent e6c87ff)

90 files changed: +3469 −14143 lines


README.md

19 additions, 28 deletions
@@ -2,7 +2,7 @@ VisionProTeleop
===========

-> **🎉 UPDATE: Now supporting Low-Latency Video Streaming!** You can now stream your robot's camera feed back to Vision Pro via the WebRTC protocol, alongside the original hand tracking data stream. No complicated network settings required. Update the app, `pip install --upgrade avp_stream`, and you're done!
+> **🎉 UPDATE: Now supporting Low-Latency Video Streaming!** You can now stream your robot's video/audio feed back to Vision Pro via the WebRTC protocol, alongside the original hand tracking data stream. No complicated network settings required. Update the app, `pip install --upgrade avp_stream`, and you're done!

<div align="center">
@@ -22,23 +22,31 @@ VisionProTeleop

-This VisionOS app and python library streams your Head + Wrist + Hand Tracking result via gRPC over a WiFi network, so any robots connected to the same WiFi network can subscribe and use it. **It can also stream stereo (or mono) camera feeds from your robot back to the Vision Pro.**
+This VisionOS app and python library streams your Head + Wrist + Hand Tracking result via gRPC over a WiFi network, so any robots connected to the same WiFi network can subscribe and use it. **It can also stream stereo (or mono) video/audio feeds from your robot back to the Vision Pro.**

> **For a more detailed explanation, check out this short [paper](./assets/short_paper_new.pdf).**

[![Star History Chart](https://api.star-history.com/svg?repos=improbable-ai/visionproteleop&type=date&legend=top-left)](https://www.star-history.com/#improbable-ai/visionproteleop&type=date&legend=top-left)

## Benchmark Results

-We performed comprehensive glass-to-glass latency measurements to evaluate the end-to-end performance of our video streaming system. The results show consistently low latency across all tested resolutions, with wired connections achieving **~20ms** at lower resolutions and wireless connections maintaining **~50-100ms** even at 4K.
+We performed comprehensive round-trip latency measurements to benchmark our video streaming system. The measurement captures the full cycle:
+
+1. Python encodes a timestamp into a video frame as a marker.
+2. The frame is transmitted over the network via WebRTC.
+3. Vision Pro decodes the image and reads the marker.
+4. Vision Pro sends the timing data back via gRPC.
+5. Python computes the latency.
+
+This provides a conservative upper bound on user-experienced latency. In our testing, the system consistently stays under 100 ms in both wired and wireless modes at resolutions below 720p. When wired (requires the developer strap), you get a stable 50 ms latency even for **stereo 4K streaming**.
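The marker round trip described above can be sketched as a minimal, self-contained illustration. Note that the byte-level marker layout and the `time.perf_counter` clock here are our own assumptions for illustration, not the actual implementation, and a real marker must also survive lossy video encoding, which this sketch ignores:

```python
import struct
import time

import numpy as np

def encode_marker(frame: np.ndarray, t: float) -> np.ndarray:
    """Stamp an 8-byte little-endian timestamp into the first 8 pixels."""
    marked = frame.copy()
    marked[0, :8, 0] = np.frombuffer(struct.pack("<d", t), dtype=np.uint8)
    return marked

def decode_marker(frame: np.ndarray) -> float:
    """Read the stamped timestamp back out of a received frame."""
    return struct.unpack("<d", frame[0, :8, 0].tobytes())[0]

# Simulated round trip: stamp, "transmit", decode, measure.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
received = encode_marker(frame, time.perf_counter())  # frame would go out via WebRTC
latency_ms = (time.perf_counter() - decode_marker(received)) * 1000.0
```

In the real pipeline the decode happens on Vision Pro and the timestamp travels back over gRPC, so the measured value includes encoding, network transit, and decoding on both ends.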
For detailed methodology, test configurations, and complete results, see the **[Benchmark Documentation](docs/benchmark.md)**.

![](comparison.png)

## How to Use

If you use this repository in your work, consider citing:
@@ -92,7 +100,7 @@ while True:

### Step 4. [🎉V2 Update🎉] Stream video feeds back to Vision Pro!

-Streaming your robot's video feed back to Vision Pro requires one additional line: `start_video_streaming`. This feature is only supported on the latest versions of the VisionOS app and python package, so make sure you upgrade both the python library and the visionOS app.
+Streaming your robot's video feed back to Vision Pro requires one additional line: `start_streaming`. This feature is only supported on the latest versions of the VisionOS app and python package, so make sure you upgrade both the python library and the visionOS app.

```python
from avp_stream import VisionProStreamer
@@ -101,8 +109,8 @@ s = VisionProStreamer(ip = avp_ip)

# you can simply start a video stream
# by defining which video device you want to use
-s.start_video_streaming(device="/dev/video0", format="v4l2", \
-                        size="640x480", fps=30, stereo=False)
+s.start_streaming(device="/dev/video0", format="v4l2", \
+                  size="640x480", fps=30, stereo_video=False)

while True:
    r = s.latest
@@ -116,25 +124,24 @@ You can also:
s = VisionProStreamer(ip = avp_ip)
# define your own image processing function, and register
s.register_frame_callback(my_own_processor)
-s.start_video_streaming(device="/dev/video0", format="v4l2", \
+s.start_streaming(device="/dev/video0", format="v4l2", \
                  size="640x480", fps=30)
```
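The `my_own_processor` callback registered above is user-defined. Here is a hypothetical example (the frame-in, frame-out numpy signature is our assumption; see the [examples](examples) folder for the actual contract) that overlays a centered crosshair:

```python
import numpy as np

def my_own_processor(frame: np.ndarray) -> np.ndarray:
    """Hypothetical frame callback: draw a white crosshair at the frame center.

    The (frame in -> frame out) signature is an assumption for illustration;
    check the examples folder for the library's actual callback contract.
    """
    out = frame.copy()
    h, w = out.shape[:2]
    out[h // 2, :] = 255  # horizontal line, all channels
    out[:, w // 2] = 255  # vertical line, all channels
    return out

frame = np.zeros((480, 640, 3), dtype=np.uint8)
processed = my_own_processor(frame)
```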

- send over as a stereo camera feed (assumes side-by-side concatenated image)

```python
s = VisionProStreamer(ip = avp_ip)
-s.start_video_streaming(device="/dev/video0", format="v4l2", \
-                        size="640x480", fps=30, stereo=True)
+s.start_streaming(device="/dev/video0", format="v4l2", \
+                  size="640x480", fps=30, stereo_video=True)
```
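"Side-by-side concatenated" means both eye views share a single frame, left half then right half. With numpy that is a horizontal concatenation (a sketch independent of the library; the frame sizes are arbitrary examples):

```python
import numpy as np

def make_side_by_side(left: np.ndarray, right: np.ndarray) -> np.ndarray:
    """Concatenate left/right eye views into one side-by-side stereo frame."""
    assert left.shape == right.shape, "both eyes must share a resolution"
    return np.hstack([left, right])

# Two 640x480 eye views become one 1280x480 stereo frame.
left = np.zeros((480, 640, 3), dtype=np.uint8)
right = np.full((480, 640, 3), 255, dtype=np.uint8)
sbs = make_side_by_side(left, right)
```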

- work without a physical camera and send over synthetically generated frames (i.e., simulation renderings, or purely synthetic images)

```python
s = VisionProStreamer(ip = avp_ip)
# define your own image generating function, and register
s.register_frame_callback(synthetic_frame_generator)
-s.start_video_streaming(device = None, format = None, \
-                        size="1280x720", fps=60)
+s.start_streaming(size="1280x720", fps=60)
```
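`synthetic_frame_generator` can produce frames from any source, such as a simulator render loop. A hypothetical stand-in (the zero-argument, frame-returning signature is our assumption; the [examples](examples) folder documents the real one) that renders a scrolling gradient at 720p:

```python
import numpy as np

_tick = 0

def synthetic_frame_generator() -> np.ndarray:
    """Hypothetical generator: a horizontally scrolling gradient at 720p.

    The no-argument, frame-returning signature is an assumption for
    illustration; see the examples folder for the actual interface.
    """
    global _tick
    _tick += 1
    ramp = (np.arange(1280, dtype=np.uint16) + _tick * 8) % 256
    frame = np.broadcast_to(ramp.astype(np.uint8)[None, :, None], (720, 1280, 3))
    return np.ascontiguousarray(frame)  # contiguous copy, safe to hand off

frame = synthetic_frame_generator()
```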

which is explained in detail in the [examples](examples) folder.
@@ -214,19 +221,3 @@ You can also modify the video viewport -- where and how the streamed video is pr

We acknowledge support from Hyundai Motor Company and ARO MURI grant number W911NF-23-1-0277.

-<!-- Misc
-
-If you want to modify the message type, feel free to modify the `.proto` file. You can recompile the gRPC proto file as follows:
-
-#### for Python
-
-```bash
-python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. handtracking.proto
-```
-
-#### for Swift
-```bash
-protoc handtracking.proto --swift_out=. --grpc-swift_out=.
-```
-After you recompile it, make sure you add it to Xcode so the app can use the latest version of the swift_proto file. -->

Tracking Streamer/ContentView.swift

0 additions, 24 deletions
@@ -23,17 +23,6 @@ struct ContentView: View {
        }
        .padding(.top, 32)

-       if showSettings {
-           VStack(spacing: 16) {
-               Text("Python Server IP:")
-                   .font(.title2)
-               TextField("e.g., 10.29.239.70", text: $pythonServerIP)
-                   .textFieldStyle(.roundedBorder)
-                   .font(.title3)
-                   .padding(.horizontal, 32)
-                   .multilineTextAlignment(.center)
-           }
-       }

        // Two start buttons side by side
        HStack(spacing: 40) {
@@ -119,19 +108,6 @@ struct ContentView: View {

        // Settings and Exit buttons
        HStack(spacing: 24) {
-           Button {
-               showSettings.toggle()
-           } label: {
-               ZStack {
-                   Circle()
-                       .fill(Color.gray.opacity(0.3))
-                       .frame(width: 50, height: 50)
-                   Image(systemName: "gearshape.fill")
-                       .font(.title2)
-                       .foregroundColor(.white)
-               }
-           }
-           .buttonStyle(.plain)

            Button {
                exit(0)

Tracking Streamer/ImmersiveView.swift

2 additions, 2 deletions
@@ -560,15 +560,15 @@ class VideoStreamManager: ObservableObject {

            if attempt % 10 == 0 && attempt > 0 {
                print("⏳ [DEBUG] Still waiting for WebRTC server info... (\(attempt)s elapsed)")
-               print("💡 [DEBUG] Make sure start_video_streaming() was called in Python")
+               print("💡 [DEBUG] Make sure start_streaming() was called in Python")
            }

            try await Task.sleep(nanoseconds: 1_000_000_000) // 1 second
        }

        guard let info = webrtcInfo else {
            print("❌ [DEBUG] Timeout: WebRTC server info not received")
-           print("💡 [DEBUG] Make sure start_video_streaming() was called in Python")
+           print("💡 [DEBUG] Make sure start_streaming() was called in Python")
            return
        }
Tracking Streamer/StatusView.swift

28 additions, 28 deletions
@@ -106,7 +106,7 @@ struct StatusOverlay: View {
    }

    private var minimizedView: some View {
-       HStack(spacing: 24) {
+       HStack(spacing: 16) {
            // Expand button
            Button {
                withAnimation(.spring(response: 0.45, dampingFraction: 0.85)) {
@@ -117,9 +117,9 @@ struct StatusOverlay: View {
                ZStack {
                    Circle()
                        .fill(Color.white.opacity(0.3))
-                       .frame(width: 60, height: 60)
+                       .frame(width: 42, height: 42)
                    Image(systemName: "arrow.up.left.and.arrow.down.right")
-                       .font(.system(size: 24, weight: .bold))
+                       .font(.system(size: 18, weight: .bold))
                        .foregroundColor(.white)
                }
            }
@@ -135,9 +135,9 @@ struct StatusOverlay: View {
                ZStack {
                    Circle()
                        .fill(Color.blue.opacity(0.8))
-                       .frame(width: 60, height: 60)
+                       .frame(width: 42, height: 42)
                    Image(systemName: videoMinimized ? "video.fill" : "video.slash.fill")
-                       .font(.system(size: 24, weight: .bold))
+                       .font(.system(size: 18, weight: .bold))
                        .foregroundColor(.white)
                }
            }
@@ -153,9 +153,9 @@ struct StatusOverlay: View {
                ZStack {
                    Circle()
                        .fill(videoFixed ? Color.orange.opacity(0.8) : Color.white.opacity(0.3))
-                       .frame(width: 60, height: 60)
+                       .frame(width: 42, height: 42)
                    Image(systemName: videoFixed ? "lock.fill" : "lock.open.fill")
-                       .font(.system(size: 24, weight: .bold))
+                       .font(.system(size: 18, weight: .bold))
                        .foregroundColor(.white)
                }
            }
@@ -167,24 +167,24 @@ struct StatusOverlay: View {
                ZStack {
                    Circle()
                        .fill(Color.red)
-                       .frame(width: 60, height: 60)
+                       .frame(width: 42, height: 42)
                    Text("")
-                       .font(.system(size: 27, weight: .bold))
+                       .font(.system(size: 20, weight: .bold))
                        .foregroundColor(.white)
                }
            }
            .buttonStyle(.plain)
-           .confirmationDialog("Are you sure you want to exit?", isPresented: $showExitConfirmation, titleVisibility: .visible) {
-               Button("Exit", role: .destructive) {
-                   exit(0)
-               }
-               Button("Cancel", role: .cancel) {}
-           }
        }
-       .padding(30)
+       .padding(20)
        .background(Color.black.opacity(0.6))
-       .cornerRadius(36)
+       .cornerRadius(25)
        .fixedSize()
+       .confirmationDialog("Are you sure you want to exit?", isPresented: $showExitConfirmation, titleVisibility: .visible) {
+           Button("Exit", role: .destructive) {
+               exit(0)
+           }
+           Button("Cancel", role: .cancel) {}
+       }
    }

    private var expandedView: some View {
@@ -822,14 +822,14 @@ struct StatusPreviewView: View {
    let videoFixed: Bool

    var body: some View {
-       HStack(spacing: 24) {
+       HStack(spacing: 16) {
            // Expand button (non-functional in preview)
            ZStack {
                Circle()
                    .fill(Color.white.opacity(0.3))
-                   .frame(width: 60, height: 60)
+                   .frame(width: 42, height: 42)
                Image(systemName: "arrow.up.left.and.arrow.down.right")
-                   .font(.system(size: 24, weight: .bold))
+                   .font(.system(size: 18, weight: .bold))
                    .foregroundColor(.white)
            }

@@ -838,35 +838,35 @@ struct StatusPreviewView: View {
            ZStack {
                Circle()
                    .fill(Color.blue.opacity(0.8))
-                   .frame(width: 60, height: 60)
+                   .frame(width: 42, height: 42)
                Image(systemName: "video.fill")
-                   .font(.system(size: 24, weight: .bold))
+                   .font(.system(size: 18, weight: .bold))
                    .foregroundColor(.white)
            }
        }

        ZStack {
            Circle()
                .fill(videoFixed ? Color.orange.opacity(0.8) : Color.white.opacity(0.3))
-               .frame(width: 60, height: 60)
+               .frame(width: 42, height: 42)
            Image(systemName: videoFixed ? "lock.fill" : "lock.open.fill")
-               .font(.system(size: 24, weight: .bold))
+               .font(.system(size: 18, weight: .bold))
                .foregroundColor(.white)
        }

        // Close button (non-functional in preview)
        ZStack {
            Circle()
                .fill(Color.red)
-               .frame(width: 60, height: 60)
+               .frame(width: 42, height: 42)
            Text("")
-               .font(.system(size: 27, weight: .bold))
+               .font(.system(size: 20, weight: .bold))
                .foregroundColor(.white)
        }
    }
-   .padding(30)
+   .padding(20)
    .background(Color.black.opacity(0.6))
-   .cornerRadius(36)
+   .cornerRadius(25)
    .fixedSize()
    .opacity(0.5) // 50% transparent
}
