Transform/Projection matrix of current frame (Locatable camera) #83
Hi @Daniel4144, this is not supported out of the box. There are several ways to do it; I suggest you have a look at the discussions on webrtc-uwp/webrtc-uwp-sdk#10, w3c/strategy#133, https://groups.google.com/forum/#!topic/discuss-webrtc/CcJnxzUsVBE, and the example at https://github.com/phongcao/webrtc-mrvc-sample. There is so far no clean, agreed-upon solution, and those approaches feel more like hacks, so I am a bit reluctant to engage in any work on this for MixedReality-WebRTC.
Let me be a bit more clear: I totally see the value of this feature for MR apps, and I am sure you are not the only one who wants it. But at this time we are focusing our efforts on other areas where there is a clear path for improvement, ahead of the v1.0 release and HoloLens 2 shipping to the public, whereas this problem is still not well defined in my opinion. I would like to see some kind of consensus among MR people, possibly a draft standard or at least a cleaner solution than hacking RTP headers, and then I could look at adding an implementation for MixedReality-WebRTC. But as interesting as it is, we unfortunately don't have the dev resources at this time to drive that research ourselves. For your particular application, I feel the synchronization with video may not be as critical as in other use cases I know of, so maybe using data channels is enough?
I understand your point; unfortunately, I do not have enough time to implement a custom solution either.
I will leave this issue open as a feature request. We can re-prioritize if there's more demand.
Almost like this; I am still looking for other ways.
Hey @zhangazheng,
@Daniel4144
I can add a bit more context, for the record if nothing else. I've done this using just the data channel and was impressed at the low latency, though it was between peers on the same network. But even then, there was a bit of "swim". The magic of HoloLens is how solidly the rendered objects are anchored to the real world, and you really need per-frame accuracy to maintain that.

The https://github.com/phongcao/webrtc-mrvc-sample project (referenced above) is specific to the VP8 codec (or it might be VP9). There is some unused space in the encoded frame header for that codec that is used to embed a frame ID. The camera transform for that frame is then sent along the data channel together with the ID. When the video frame is received, the ID is extracted and used to look up the transform for that frame. It works very well, but only for the VP8/9 codec, and it uses reserved space in the header, which is arguably risky. It also requires customization of the public WebRTC code base.

WebRTC now has a multiplex codec that allows metadata to be multiplexed with each frame. This provides a mainstream way to embed a frame ID, or even the camera transform itself, with each frame. But the base WebRTC code currently has no public APIs to provide metadata as input to encoding and extract it again upon receipt of each frame. Also problematic is that SDP (to my knowledge) doesn't allow for negotiation of nested chains of codecs. So if you negotiate multiplex, there has to be a hard-coded assumption about which video codec it wraps, which in the Google codebase is currently VP9. So you lose the benefits of codec negotiation with this approach.

I would also love to see an enhancement along these lines, but I think it would require non-trivial work in the base WebRTC and may not even be possible at all without imposing constraints like forcing the use of specific codecs.
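To make the frame-ID correlation idea concrete, here is a rough receiver-side sketch. It is not the webrtc-mrvc-sample code; all names are illustrative, and it assumes you already have a per-frame ID extracted from the encoded frame and the corresponding camera pose arriving over the data channel.

```csharp
using System.Collections.Concurrent;
using UnityEngine;

// Correlates camera poses (received over the data channel) with decoded video
// frames, using the frame ID embedded in the encoded frame as the key.
public class FramePoseCorrelator
{
    private readonly ConcurrentDictionary<uint, Matrix4x4> _posesByFrameId =
        new ConcurrentDictionary<uint, Matrix4x4>();

    // Call this when a pose message arrives on the data channel.
    public void OnPoseMessage(uint frameId, Matrix4x4 cameraToWorld)
    {
        _posesByFrameId[frameId] = cameraToWorld;
    }

    // Call this when a decoded video frame (with its embedded ID) is ready to render.
    // A real implementation needs a small reorder buffer / timeout, since the pose
    // may arrive slightly before or after the frame it belongs to.
    public bool TryGetPoseForFrame(uint frameId, out Matrix4x4 cameraToWorld)
    {
        return _posesByFrameId.TryRemove(frameId, out cameraToWorld);
    }
}
```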
All very good info, thanks @kspark-scott, and thanks to the others who also contributed. I would also be interested to see something done in this area; there seems to be some demand, and this is well inside the boundaries of the project. I think we will consider it for a next milestone.
Note to self: link to the WebRTC multiplex issue https://bugs.chromium.org/p/webrtc/issues/detail?id=9632
@Daniel4144 I would like to get some more details on how you were able to get the projection matrix and transform from the RGB camera while running the WebRTC Unity app.
Hi @astaikos316,
@Daniel4144 I guess I am confused as to how to use the matrix to take an object placed on the 2D view in a desktop app to a world coordinate on the HoloLens.
@astaikos316 I'll try to explain what I did:
One more thing I've noticed: the MRC I get back from the HoloLens is shifted to the left relative to the camera image (it gets more accurate with greater distance), so I couldn't use the MRC to check the positioning and instead had to check how it actually looks on the HoloLens. Maybe @djee-ms knows something about this: is it possible, or even necessary, to calibrate the MRC (it is always shifted to the left)? The only option I could find is to turn it on/off.
Hi @Daniel4144, regarding alignment, there are known alignment issues with MRC on HoloLens 1. Specifically, alignment drifts as you get further away from the focus point/plane for the scene. Might that be what you're seeing? If you're not familiar with this, you can read at least a high-level summary here (search for section
According to https://docs.microsoft.com/en-us/windows/mixed-reality/focus-point-in-unity, setting the focus point manually should not be necessary (if Enable Depth Buffer Sharing is set), but I'll give it a try. Thanks for the hint @kspark-scott.
@kspark-scott beat me to it. Yes, this is likely the alignment drift from using the wrong focus point.
@Daniel4144 I believe I am having trouble reversing the direction of the ray I get from the 2D view on the desktop. Right now the holograms are appearing behind me instead of where I think they should. I've tried changing the sign of the direction ray, but when I do that I cannot get holograms to instantiate at all. I am also not sure if I am properly changing the direction to world space once it is transmitted to the HoloLens. What function would be used for that?
You can simply use the Unity transform functions (https://docs.unity3d.com/ScriptReference/Transform.TransformDirection.html).

Regarding my MRC problem: setting the focus point manually for every frame solved it. I think the shaders I used didn't write into the depth buffer, so the automatic approach I linked above did not work.
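For what it's worth, a minimal sketch of that conversion and placement step might look like the following. The names are placeholders: frameCamera stands for a Transform set to the camera pose of the captured frame, and localDirection is the camera-space ray direction received from the desktop.

```csharp
using UnityEngine;

public class RemoteRayPlacer : MonoBehaviour
{
    public Transform frameCamera;   // pose of the camera when the frame was captured
    public GameObject markerPrefab; // hologram to place where the ray hits

    public void PlaceMarker(Vector3 localDirection)
    {
        // TransformDirection applies only the rotation, so the result is not
        // affected by the transform's position or scale.
        Vector3 worldDirection = frameCamera.TransformDirection(localDirection.normalized);
        Vector3 worldOrigin = frameCamera.position;

        // Cast against the spatial mapping / scene colliders and place the hologram.
        if (Physics.Raycast(worldOrigin, worldDirection, out RaycastHit hit, 10f))
        {
            Instantiate(markerPrefab, hit.point, Quaternion.LookRotation(-worldDirection));
        }
    }
}
```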
@Daniel4144 I am not sure where to set the focus point manually for every frame. I'm also just using standard MRTK shaders for this project.
Set it in any Update() in your scene, as in the example at https://docs.microsoft.com/en-us/windows/mixed-reality/focus-point-in-unity
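In case it helps, a minimal version of that page's example looks roughly like this (legacy WSA API; depending on your Unity/XR setup the call may live elsewhere, and focusTarget is whatever object you want the stabilization plane to follow):

```csharp
using UnityEngine;
using UnityEngine.XR.WSA;

public class FocusPointSetter : MonoBehaviour
{
    public Transform focusTarget; // object the stabilization plane should follow

    void Update()
    {
        if (focusTarget == null) { return; }

        // Plane through the target, with its normal facing back toward the user.
        Vector3 normal = -Camera.main.transform.forward;
        HolographicSettings.SetFocusPointForFrame(focusTarget.position, normal);
    }
}
```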
These assumptions are going to get challenged with HoloLens 2, as I believe neither of them will be true there.
@Daniel4144 et al., thank you for all the valuable information and discussion so far. One other thing I cannot figure out is which values are the camera offsets you described earlier, and where I should set and apply them. I've tried adding some offsets to a few values but nothing has worked so far.
@Daniel4144 I am still having trouble with the offsets and figuring out where those values are or where to add them in.
What happens on HoloLens 2?
@zhangazheng The HoloLens 2 uses eye tracking to ensure a more correct placement of holograms. As such, the transformation between the headset and the eyes will not be constant: it changes depending on which user is wearing the HoloLens and how they place it on their head. It can also change at runtime if the user repositions the HoloLens. The projection matrix is also dynamic, as the projection changes with the position of the user's eyes relative to the display. TL;DR: neither the projection matrix nor the virtual-camera-to-physical-camera transform is constant.
Guys, any good ideas? Dynamics 365 Remote Assist is very stable. I guess that is because MS can get the deep-level information.
Any idea about what? I don't think Remote Assist uses WebRTC, so they won't have the issue of finding a way to send the head position through the WebRTC protocol.
At a minimum, is it possible to get the camera-to-world matrix locally whenever the local video frame is ready? I'm trying to do something more static. I was able to accomplish this previously with the WebRTC-universal-samples project. From my remote source, I want to send a command (which can be done over the data channel) to my HoloLens 2 that causes a local positional "snapshot", storing the head position local to the device at the moment the command is received. I do this so I can place an object in my scene according to commands given at the remote viewing station. The first step is creating the snapshot; later commands use it to position new objects.
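The snapshot part of that can be done entirely on the device; a rough sketch follows. OnCommandReceived is a placeholder to wire up to whatever message callback your data channel exposes, and the "snapshot" command string is illustrative.

```csharp
using System.Text;
using UnityEngine;

public class PoseSnapshotter : MonoBehaviour
{
    public Vector3 SnapshotPosition { get; private set; }
    public Quaternion SnapshotRotation { get; private set; }
    public bool HasSnapshot { get; private set; }

    private volatile bool _snapshotRequested;

    // Data channel callbacks may fire on a background thread, so only set a flag
    // here and read the head pose on the Unity main thread in Update().
    public void OnCommandReceived(byte[] message)
    {
        if (Encoding.UTF8.GetString(message) == "snapshot")
        {
            _snapshotRequested = true;
        }
    }

    void Update()
    {
        if (!_snapshotRequested) { return; }
        _snapshotRequested = false;

        // Camera.main follows the head pose in a HoloLens Unity app.
        var head = Camera.main.transform;
        SnapshotPosition = head.position;
        SnapshotRotation = head.rotation;
        HasSnapshot = true;
    }
}
```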
I want to know how to get a still image and the world matrix while WebRTC video streaming is active. My use case is almost the same as @Peskey's; I don't need to send the information via WebRTC.
@Peskey have you been able to get the camera projection matrix from the HoloLens 2? I have tried for a few days now since I just got my headset, but I only get an identity matrix no matter what resolution I set when using the Unity PhotoCapture API.
@djee-ms will this be a feature? WebRTC in mixed reality is fairly useless for my purposes without the ability to get the camera projection matrix associated with my video stream. With webrtc-uwp-sdk being deprecated, it's important for me to find an alternative that runs on the HoloLens 2 but also supports getting the camera matrix. I couldn't see how to do this without modifying multiple levels of your code. Is there a simpler way I'm missing?
I am trying to take a photo with shared-mode MediaCapture while using WebRTC. This method worked well when I used it with webrtc-uwp-sdk.

var frameSource = await GetFrameSourceAsync();
var result = await StartMediaFrameReaderAsync(frameSource);

private async Task<FrameSource> GetFrameSourceAsync()
{
    var frameSourceGroups = await MediaFrameSourceGroup.FindAllAsync();
    foreach (var sourceGroup in frameSourceGroups)
    {
        foreach (var sourceInfo in sourceGroup.SourceInfos)
        {
            // Pick the color video-record source (the RGB camera).
            if (sourceInfo.MediaStreamType == MediaStreamType.VideoRecord
                && sourceInfo.SourceKind == MediaFrameSourceKind.Color)
            {
                return new FrameSource()
                {
                    Group = sourceGroup,
                    Info = sourceInfo
                };
            }
        }
    }
    return null;
}

private async Task<bool> StartMediaFrameReaderAsync(FrameSource frameSource)
{
    try
    {
        var mediaCapture = new MediaCapture();
        // Shared read-only mode so WebRTC can keep using the camera.
        var settings = new MediaCaptureInitializationSettings()
        {
            SourceGroup = frameSource.Group,
            SharingMode = MediaCaptureSharingMode.SharedReadOnly,
            MemoryPreference = MediaCaptureMemoryPreference.Cpu,
            StreamingCaptureMode = StreamingCaptureMode.Video
        };
        await mediaCapture.InitializeAsync(settings);
    }
    catch (Exception ex)
    {
        Debug.Log(ex.Message);
        return false;
    }
    // ...
}
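On the world-matrix side of this, once a MediaFrameReader is delivering frames from a source like the one above, the per-frame pose and intrinsics can in principle be read from the frame itself. Below is a sketch using the standard UWP locatable-camera APIs; whether the shared-mode capture actually exposes these properties in this setup is an assumption I have not verified.

```csharp
using System.Numerics;
using Windows.Media.Capture.Frames;
using Windows.Perception.Spatial;

public class FramePoseReader
{
    private readonly SpatialCoordinateSystem _worldCoordinateSystem;

    public FramePoseReader()
    {
        // World coordinate system anchored at the device's current location.
        _worldCoordinateSystem = SpatialLocator.GetDefault()
            .CreateStationaryFrameOfReferenceAtCurrentLocation()
            .CoordinateSystem;
    }

    // Subscribe this handler to MediaFrameReader.FrameArrived.
    public void OnFrameArrived(MediaFrameReader sender, MediaFrameArrivedEventArgs args)
    {
        using (MediaFrameReference frame = sender.TryAcquireLatestFrame())
        {
            if (frame == null) { return; }

            // Camera-to-world transform for this frame, if the source is locatable.
            SpatialCoordinateSystem cameraCoordinateSystem = frame.CoordinateSystem;
            Matrix4x4? cameraToWorld =
                cameraCoordinateSystem?.TryGetTransformTo(_worldCoordinateSystem);

            // Intrinsics (focal length, principal point) for projecting/unprojecting pixels.
            var intrinsics = frame.VideoMediaFrame?.CameraIntrinsics;

            if (cameraToWorld.HasValue && intrinsics != null)
            {
                // Use cameraToWorld.Value and intrinsics here, or cache them keyed
                // by the frame timestamp for later correlation with the video stream.
            }
        }
    }
}
```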
I've found the cause.
@tarukosu
I wonder if there has been any progress since. When will WebRTC be able to get the locatable camera data?
@zhangazheng have you solved this problem? Thank you!
Hi,
I am currently using this project to stream mixed reality capture from a HoloLens to a desktop application, which works great so far.
Now I want to visualize the desktop mouse position on the MRC in 3D space on the HoloLens. For that I need the position of the camera from which each frame was captured.
Is it possible to get the projection matrix / transformation matrix of the camera for each frame?
Thanks in advance!