Human Movement Tracking as Fine Grained Position Input for Wide Area Virtual Reality

This paper describes current progress towards providing untethered relative location tracking in a wide area setting for virtual reality applications. The goal is to allow a user to walk and turn in a virtual space by walking in the real world. Our implementation uses accelerometer and gyroscope sensors attached to the user’s ankles to detect and track leg motion. Tracking is detailed, picking up not just steps, but also the size and timing of those steps. Estimated location change information is communicated wirelessly to a stand-alone virtual reality headset where it is used to drive player movement in a game setting. Small scale testing has established that the system provides a comfortable movement experience in which users can confidently get from point to point. It has also identified issues concerning maintenance of stability in direction estimation, detection of sideways steps, and lag from detection to observed movement.


Introduction
The ultimate goal of virtual reality technology is to allow a person or group of people to experience being in a new environment in such a way that it seems as though they are really there. The sensation of presence in such an environment would involve vision, sound, feeling, and smell. The ability to act in an environment would include being able to look around, to move one's body, to grasp and interact with objects, to climb on objects, etc. Current VR applications can provide good experiences with some of these capabilities.
Being able to stand in one place and look around requires the vision component, coupled to orientation tracking of the user's head. Such a system provides a powerful experience for users, allowing them to clearly appreciate the size and relationships of items in the environment. Adding positional sound can further enhance the experience. In recent years great progress has been made in the performance and accessibility of virtual reality vision and sound systems. Headsets are now readily available which allow a wide visual field to be presented, and in which the view can be made to respond to head movement with minimal lag. Popular current systems include the Oculus Rift and HTC Vive. The Google Cardboard and Daydream systems and the Oculus Go are examples of vision systems built using mobile technology.
Adding the ability to move in an environment is a natural next step. Loosely gathering terminology from [1] and [2], we can classify movement systems in VR as:
• Teleportation: The user issues a command, usually by pointing a hand held controller and clicking. They are then instantly placed at the destination location. A visual fade-in/fade-out transition may help, but disorientation and motion sickness are likely.
• Artificial Locomotion: The user moves continuously in the virtual space using a controller to set direction and speed, without movement in real space. The metaphor of controlling a vehicle is often used, e.g. flying a spaceship. The inconsistency between real and visual motion can also induce motion sickness.
• Perambulation or Natural Locomotion: The actual movement of the user in real space is picked up. This is the system which is the least disorienting and least likely to cause motion sickness. Redirected motion [3] can be used to give the impression of moving in a larger area than that actually used.
There are many existing approaches to the implementation of perambulation movement.
The Oculus Rift and HTC Vive systems use cameras to track the user's headset and hand controllers. A typical setup involves two cameras, mounted on opposite corners of an area of up to 4 m by 4 m. The user can move freely in this space, with the proviso that their headset is tethered by a cable feeding video and other data, their movement being directly reflected in the virtual experience. The fidelity is good, but the space is very limited. In particular it is not large enough for motion redirection. Other systems allow for larger spaces. Motion capture studios have users wear special suits (marked or augmented with reflectors) and track movement with a number of cameras ranged about the capture space. Capture spaces can be quite large, even tens of metres on a side. Such a system is used for game playing commercially by Zero Latency [4], where cameras track users' headsets and weapons in a warehouse sized space. The space is large enough for motion redirection.
Systems which provide arbitrarily large virtual spaces include the omnidirectional treadmill [5] and human sized hamster ball [6]. There are also a number of commercial treadmill style movement systems. Typical is the Virtuix Omni [7] in which the user's feet slip on a basin shaped surface to allow stepping motion in any direction, with the feet slipping back to a centre position after each step. These systems allow large virtual spaces, but have unnatural motion.
In summary, current VR motion systems can offer accurate motion detection in limited spaces, or they can offer unnatural or limited accuracy detection in large spaces. Our project uses human movement tracking to allow accurate motion detection in a large (outdoor) space.

The System
The system we are developing consists of a portable (untethered) head mounted display. At present we are using a Google Pixel phone in a Daydream headset [8]. The Google Daydream system is used to host a virtual reality game environment developed using Unreal Engine 4. The user wears gyroscope and accelerometer sensors on each ankle. Rather than building a system around the sensor electronics ourselves, we are using cell phones in simple holders (sold as arm-bands for joggers). The ankle phones transmit movement data to the display phone which updates the player position in the virtual world accordingly. The result is a system in which a user can walk in the real world, over a large area, and have that movement, or a modified version of it, reflected in a virtual space.
The advantage of the system is the detail with which movement can be captured. It combines the fidelity of small scale systems with a wide area for motion. Our system allows:
• Movement in any direction. The user can step forward or backward, left or right. They can turn and walk in any direction.
• Movement direction is independent of view direction. The user can look around freely as they move.
• Movement is captured in fine detail. The user may lift a foot into the air, then pull back. They may move quickly or slowly. The view in the headset reflects these motions. Note that we cannot track sustained motion of this kind. The system expects the foot to be put back on the ground frequently.
• The scale of movement is great enough to implement redirection, including redirection of user orientation. We have done some experiments, including having a user walk around the inside of a cylinder. (In that virtual world, it was assumed that gravity acted radially.)

Implementation
Software running in each of the ankle phones receives inputs from the gyroscope and accelerometer sensors at approximately 300 Hz. Motion is reconstructed from these values as follows.
When the user is stationary, with feet flat on the ground, an averaged accelerometer reading is taken, to be used as an estimate of gravity (approx. 9.81 m/s² upward). The phones are each mounted on the outside of the leg, just above the ankle. They are flat against the leg and oriented roughly upright in portrait mode. The axis system of the phone sensors therefore has x pointing mostly forward, y mostly upward and z mostly outward. It cannot be assumed that the phones are exactly upright, or that they do not move a little against the legs as the user walks. The gravity estimate gives part of the initial orientation of the phone. It will be re-estimated frequently as explained later. The initial orientation from gravity is just a measure of how upright the phone is. We take the initial horizontal direction to be towards 'North', or any chosen direction in the virtual world.
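As an illustration of the gravity estimate described above, the following minimal Python sketch averages accelerometer samples taken at rest and reports how far the phone is tilted from upright. The function names and the (x, y, z) tuple layout are our own assumptions for illustration and are not taken from the actual ankle phone software.

```python
import math


def estimate_gravity(samples):
    """Average accelerometer samples (x, y, z) in m/s^2 taken while the foot is at rest."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    return tuple(sum(s[i] for s in samples) / n for i in range(3))


def tilt_from_upright_deg(gravity):
    """Angle in degrees between the gravity estimate and the phone's y axis.

    Assumes the mounting described in the text, where y points mostly upward.
    """
    gx, gy, gz = gravity
    norm = math.sqrt(gx * gx + gy * gy + gz * gz)
    if norm == 0.0:
        raise ValueError("gravity estimate has zero length")
    # Clamp to guard against rounding pushing the ratio just outside [-1, 1].
    cos_tilt = max(-1.0, min(1.0, gy / norm))
    return math.degrees(math.acos(cos_tilt))
```

A perfectly upright phone would give a tilt of zero. In practice a small non-zero tilt is expected, since the phones are only roughly upright in their holders.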
Motion estimation begins by subtracting the gravity estimate from measured acceleration values to provide acceleration due to movement. In human leg movement there is considerable rotation (mostly about the z axis) from the knee. This is accounted for in our software by integrating information from the gyroscope to track changes in the phone orientation and using this to continually convert acceleration readings to the coordinate system in which the gravity estimate was taken.
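The orientation tracking and gravity subtraction can be sketched as follows. This is a minimal illustration only: it assumes the orientation is held as a 3x3 rotation matrix R mapping phone axes to the reference frame, and it uses Rodrigues' rotation formula for each gyroscope sample. The text does not state which integration scheme is actually used, and all function names here are hypothetical.

```python
import math


def mat_mul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(m, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def rotation_from_gyro(omega, dt):
    """Rotation matrix for angular velocity omega (rad/s) applied for dt seconds."""
    wx, wy, wz = omega
    rate = math.sqrt(wx * wx + wy * wy + wz * wz)
    angle = rate * dt
    if angle < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    kx, ky, kz = wx / rate, wy / rate, wz / rate
    c, s = math.cos(angle), math.sin(angle)
    v = 1.0 - c
    return [
        [c + kx * kx * v, kx * ky * v - kz * s, kx * kz * v + ky * s],
        [ky * kx * v + kz * s, c + ky * ky * v, ky * kz * v - kx * s],
        [kz * kx * v - ky * s, kz * ky * v + kx * s, c + kz * kz * v],
    ]


def update_orientation(R, omega, dt):
    """Advance R by one gyroscope sample.

    Gyroscope rates are expressed in phone axes, so the increment is
    applied on the right of R.
    """
    return mat_mul(R, rotation_from_gyro(omega, dt))


def motion_acceleration(R, accel_phone, gravity_ref):
    """Rotate a reading into the reference frame, then subtract the gravity estimate."""
    a_ref = mat_vec(R, accel_phone)
    return [a_ref[i] - gravity_ref[i] for i in range(3)]
```

For example, if the phone has rotated a quarter turn about z since the gravity estimate was taken, a stationary reading that now appears along the phone's x axis is rotated back onto the reference y axis, and the gravity subtraction leaves approximately zero motion acceleration.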
The next step is to integrate the acceleration values to give velocity, and velocity to give position change. As is well known, the double integration is not reliable, typically giving rise to large velocity and consequently position errors. In our case, however, we can take advantage of the fact that each foot rests on the ground while the other is moving. We need only integrate for the duration of a step. When the foot returns to the ground, integration stops. Advantage is taken of periods on the ground to repeat the gravity estimation. New gravity estimates allow us to correct some drift in the orientation. In particular, if a phone has slipped in its holder, or the holder has slipped against the user's leg, changes to vertical orientation can be corrected, making sure that gravity can be reliably subtracted from observed acceleration. Correction is performed as a rotation making minimal change to the dimensions of orientation related to horizontal direction. The result preserves a horizontal direction that is the result of continuous gyroscope integration. In early experiments we have found the accuracy to be moderate. In one experiment a walk around a circle of radius 35 m ended with a total drift of 20 degrees. The direction drift is such that orientation does not maintain a precisely fixed relationship with the real world, but does change slowly enough for the change to be imperceptible, rather like the effect of deliberate orientation redirection.
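The per-step double integration can be sketched as below. This is an illustrative sketch under our own assumptions: it uses the trapezoidal rule (the text does not specify a numerical scheme), it starts each step from rest, and the function name `integrate_step` is hypothetical.

```python
def integrate_step(accels, dt):
    """Double-integrate gravity-free acceleration over a single step.

    accels: list of (x, y, z) samples in m/s^2, already rotated into the
            reference frame, covering the step from lift-off to landing.
    dt:     sample interval in seconds.
    Returns (displacement, final_velocity), each an (x, y, z) tuple.
    The foot is assumed to be at rest when the step begins.
    """
    velocity = [0.0, 0.0, 0.0]
    position = [0.0, 0.0, 0.0]
    if not accels:
        return tuple(position), tuple(velocity)
    prev = accels[0]
    for a in accels[1:]:
        for i in range(3):
            # Trapezoidal rule for acceleration -> velocity -> position.
            new_v = velocity[i] + 0.5 * (prev[i] + a[i]) * dt
            position[i] += 0.5 * (velocity[i] + new_v) * dt
            velocity[i] = new_v
        prev = a
    return tuple(position), tuple(velocity)
```

Because the integrators are reset for every step, any error accumulated during one step is not carried into the next, which is the property the approach relies on.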
The explanation given so far is complete except for the issue of deciding when a step is taking place. We have observed that steps begin with a sharp acceleration upward. At least, that is the case when walking forward or backward. When walking slowly sideways a person may only raise their foot off the ground slightly and then move horizontally. Our system has a hand coded automaton to track a step, driven mostly by vertical acceleration and experimentally determined thresholds. When there has been a short vertical acceleration upward, integration begins. Integration is stopped just before the foot hits the ground. The sharp and fluctuating acceleration values observed after that time are not used. When there has been no vertical acceleration for a short time, gravity estimation for the next step can begin. The automaton checks that the sequence of times and accelerations observed is consistent with stepping and abandons tracking of a step when unexpected inputs are observed, so that position integration is not done when movement is not as expected. We have observed that it detects steps in steady forward walking with 99% accuracy. Results with horizontal movement are poor (< 50%).
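A much simplified automaton of this kind is sketched below. It is not the hand coded automaton used in our system: the three phases, the use of a large acceleration spike to mark landing, and every threshold value are assumptions chosen purely for illustration.

```python
from enum import Enum


class Phase(Enum):
    REST = "rest"      # foot on the ground; gravity can be re-estimated
    SWING = "swing"    # step in progress; acceleration is integrated
    SETTLE = "settle"  # foot has landed; noisy readings are ignored


class StepAutomaton:
    """Toy step tracker driven by gravity-free vertical acceleration (m/s^2).

    All thresholds are illustrative placeholders, not measured values.
    """

    def __init__(self, lift_threshold=2.0, impact_threshold=15.0,
                 quiet_threshold=0.5, lift_samples=5, quiet_samples=30,
                 max_swing_samples=450):
        self.lift_threshold = lift_threshold
        self.impact_threshold = impact_threshold
        self.quiet_threshold = quiet_threshold
        self.lift_samples = lift_samples
        self.quiet_samples = quiet_samples
        self.max_swing_samples = max_swing_samples
        self.phase = Phase.REST
        self.count = 0
        self.abandoned = False

    def update(self, vertical_accel):
        """Feed one sample and return the phase after processing it."""
        if self.phase is Phase.REST:
            # A short run of upward acceleration starts a step.
            self.count = self.count + 1 if vertical_accel > self.lift_threshold else 0
            if self.count >= self.lift_samples:
                self.phase, self.count, self.abandoned = Phase.SWING, 0, False
        elif self.phase is Phase.SWING:
            self.count += 1
            if abs(vertical_accel) > self.impact_threshold:
                # Large spike: treat as landing and stop integrating.
                self.phase, self.count = Phase.SETTLE, 0
            elif self.count > self.max_swing_samples:
                # Step lasted implausibly long: abandon tracking of it.
                self.phase, self.count, self.abandoned = Phase.SETTLE, 0, True
        else:
            # Wait for a sustained quiet period before returning to rest.
            self.count = self.count + 1 if abs(vertical_accel) < self.quiet_threshold else 0
            if self.count >= self.quiet_samples:
                self.phase, self.count = Phase.REST, 0
        return self.phase
```

A caller would integrate acceleration only while the returned phase is `Phase.SWING`, discard the integrated displacement if `abandoned` is set, and re-estimate gravity while the phase is `Phase.REST`.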
The display phone (in the headset) is configured as a WiFi hotspot. Ankle tracking phones are connected to the hotspot and send changes in integrated position using TCP messages in real time. There is a possibility that message aggregation might cause lag, and it may be preferable to use UDP datagrams instead, but we have not investigated the issue yet. There is a lag in onset of movement detection and transmission caused by the vertical acceleration detection requirement of the step detection automaton. After that, response time depends on data transmission time and the responsiveness of the game engine and display.
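One option short of moving to UDP would be to keep TCP but disable Nagle's algorithm, which is a common source of small-message aggregation. We have not established that this is the cause of any lag in our system. The sketch below shows the idea in Python. The 13-byte message layout (a foot identifier followed by three float32 position deltas) and all function names are hypothetical and do not describe our actual wire format.

```python
import socket
import struct

# Hypothetical wire format: foot id (1 byte) then dx, dy, dz as little-endian float32.
DELTA_FORMAT = "<Bfff"
DELTA_SIZE = struct.calcsize(DELTA_FORMAT)


def encode_delta(foot_id, dx, dy, dz):
    """Pack one position change into a fixed-size message."""
    return struct.pack(DELTA_FORMAT, foot_id, dx, dy, dz)


def decode_deltas(buffer):
    """Split received bytes into complete messages.

    TCP is a byte stream, so several messages (or a partial one) may arrive
    in a single read. Returns (messages, leftover_bytes).
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= DELTA_SIZE:
        messages.append(struct.unpack_from(DELTA_FORMAT, buffer, offset))
        offset += DELTA_SIZE
    return messages, buffer[offset:]


def open_low_latency_connection(host, port):
    """Connect over TCP with Nagle's algorithm disabled (TCP_NODELAY)."""
    sock = socket.create_connection((host, port))
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

Fixed-size messages keep the receiver simple: it appends each read to the leftover bytes from the previous read and decodes whatever complete messages are available.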
Data received from the ankle phones is used to drive player movement in a UE4 game. We have coded a module to establish network connection with phones and pass that data into the game. There are trade-offs in applying movement to players. Locations can be directly changed on each frame of game animation, giving smooth movement with little lag. However this relinquishes the option of integration with character walking animation and game physics simulation. Our current system applies movement increment 'requests' which cause animation and can be redirected on collisions. This requires buffering of received position changes and can introduce both lag and inaccuracy in position change.
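The buffering trade-off can be illustrated with a small sketch, written in Python for brevity rather than as UE4 code. It assumes, hypothetically, that the game accepts at most a fixed distance of requested movement per frame. The class `MovementBuffer` and its methods are our own illustration, not part of the game module.

```python
import math


class MovementBuffer:
    """Accumulates received horizontal position changes and releases them per frame."""

    def __init__(self):
        self.pending = [0.0, 0.0]

    def add(self, dx, dy):
        """Record a position change received from an ankle phone."""
        self.pending[0] += dx
        self.pending[1] += dy

    def take(self, max_distance):
        """Return the movement to request this frame, capped at max_distance metres.

        Whatever exceeds the cap stays buffered for later frames, which is
        where the additional lag comes from.
        """
        px, py = self.pending
        dist = math.hypot(px, py)
        if dist <= max_distance:
            self.pending = [0.0, 0.0]
            return (px, py)
        scale = max_distance / dist
        out = (px * scale, py * scale)
        self.pending = [px - out[0], py - out[1]]
        return out
```

With a cap of 0.2 m per frame, a buffered change of 0.5 m takes three frames to drain, so the player's visible position trails the measured foot position during that time.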
We have yet to investigate proper integration of our step detection with game walking animation. Issues with the system at present include difficulty in detecting sideways steps, drift in horizontal direction, and possibly lag caused by the step detection automaton and integration into the game system. We are experimenting with the use of compass sensors for improving accuracy of horizontal direction tracking. At present the system does not have safety features. As users are walking blindly about in real space it is necessary to have an observer watching to ensure that they are not getting physically close to the edge of a physical playing area. In addition our practical experiments have used a virtual play space with clear boundaries that is small in comparison to the physical space available and have been of short enough duration to be certain that position drift could not lead users into danger.

Results
We [9] have tested the system with 8 participants in a rectangular virtual environment with a position grid marked on the virtual ground, a number of signposts of different kinds and a small ramp/bridge construction. In reality participants were walking on a large sports ground that was flat and without obstacles, and large enough that they were not in danger of reaching the edge. The experience was that participants were able to successfully move as requested in the space, and found the experience pleasant. It was interesting to observe that they started tentatively but were soon moving confidently.