Virtualphones Technology comfortably surrounds you with sound without covering your ears.
Virtualphones Technology (VPT™) localizes individual sounds around the listener, creating a more immersive experience in which sounds are easily differentiated, as if coming from speakers in front of and behind the listener. When used together with VR/AR technology, the sound can be so immersive that users almost feel they are experiencing the location and the situation in real life.
For N, the neckband-style wearable device (hereinafter referred to as the "neckband") contains speakers that let the user listen to audio directly from the neckband. VPT reproduces realistic, enhanced audio effects through the following process. It models and controls the characteristics of the sound path from the speakers to the eardrums for best results. In addition, appropriate reverberation is added so that the reproduced sound blends naturally with the surrounding environment. As a result, sound from the neckband speakers reaches the ears in a more natural manner.
To improve user safety, voice audio playback is oriented toward the front. This prevents the illusion of voices coming from behind, which could cause the user to mistakenly look backward.
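The core idea, shaping the sound per ear and blending in reverberation, can be sketched as a pair of convolutions. The example below is a minimal illustration, not Sony's actual processing: the head-related impulse responses (HRIRs) and reverb tail are toy placeholder values, and real VPT filters would be measured and far longer.

```python
import numpy as np

def virtualize(mono, hrir_left, hrir_right, reverb_ir, wet=0.2):
    """Place a mono source at a virtual position by convolving it with a
    pair of head-related impulse responses, then blend in reverberation
    so the result sits naturally in the surrounding space."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    rev_l = np.convolve(left, reverb_ir)[:len(left)]
    rev_r = np.convolve(right, reverb_ir)[:len(right)]
    return ((1 - wet) * left + wet * rev_l,
            (1 - wet) * right + wet * rev_r)

# Toy HRIRs for a source at front-left: the right ear hears the sound
# slightly later and quieter than the left ear.
hrir_l = np.array([1.0, 0.0, 0.0])
hrir_r = np.array([0.0, 0.0, 0.6])
reverb = np.array([1.0, 0.0, 0.0, 0.3, 0.0, 0.1])
l, r = virtualize(np.array([1.0, 0.5, 0.25]), hrir_l, hrir_r, reverb)
```

The interaural delay and level difference encoded in the two HRIRs are what let the brain localize the source to one side, while the reverb mix ties it into the room.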
Clear Phase technology provides clear and natural sound, without extra bulk or weight.
Clear Phase uses high-precision digital signal processing to smooth the amplitude characteristics of the speakers and provide more linear phase characteristics. This technology provides a natural sound reproduction and a clear sound image position.
For N, the speakers are placed in the neckband-style wearable device (hereinafter referred to as the "neckband") so that sound is directed outward from the listener. This lets users feel as if the sound is woven into the surrounding space. However, it also tends to reduce the level of the high-frequency range reaching the ears.
The use of small speakers results in a relatively high f0 (lowest resonance frequency). This is similar to open-ear earphones, whose special structure also gives them different resonant frequency characteristics compared with conventional earphones.
The special features of Clear Phase allow high-precision correction of these audio characteristics, resulting in clear and natural sound reproduction from small speakers.
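One common way to smooth a speaker's amplitude response while keeping the phase linear is a symmetric (linear-phase) FIR correction filter designed by frequency sampling. The sketch below assumes that generic approach; the measured response, tap count, and boost limit are illustrative values, not Sony's actual design.

```python
import numpy as np

def correction_filter(measured_mag, n_taps=64):
    """Linear-phase FIR correction via frequency sampling: the target
    magnitude is the inverse of the speaker's measured response at
    n_taps // 2 + 1 uniformly spaced frequency bins."""
    inverse = 1.0 / np.maximum(measured_mag, 1e-3)  # cap boost at deep nulls
    # A real, zero-phase target spectrum yields an even-symmetric impulse
    # response; shifting it to the centre tap makes it causal while the
    # symmetry keeps the phase exactly linear (a pure delay).
    h = np.fft.irfft(inverse, n=n_taps)
    return np.roll(h, n_taps // 2)

# Toy measurement: a small speaker rolling off toward low frequencies.
bins = np.linspace(0.0, 1.0, 33)
measured = np.clip(4.0 * bins, 0.25, 1.0)
h = correction_filter(measured)
```

The symmetry of the resulting taps is what guarantees linear phase: every frequency is delayed by the same amount, so the sound image stays coherent.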
xLOUD is N's automatic sound-adjusting function.
It adjusts the volume according to the level of ambient noise in your surroundings.
xLOUD analyzes the input signal and applies non-linear amplitude conversion according to the signal characteristics.
This audio signal processing technology gives the user a sufficient playback level without any clipping.
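A simple way to raise playback level without hard clipping is to pass the boosted signal through a smooth saturating curve. The tanh curve below is an illustrative stand-in for xLOUD's actual non-linear amplitude conversion, whose real shape is signal-dependent.

```python
import numpy as np

def soft_boost(signal, gain_db):
    """Non-linear amplitude conversion: apply gain, then pass the result
    through a smooth saturating curve so the output approaches but never
    exceeds full scale -- louder playback with no hard clipping."""
    gain = 10.0 ** (gain_db / 20.0)
    return np.tanh(gain * signal)

x = 0.8 * np.sin(np.linspace(0.0, 2.0 * np.pi, 100))
y = soft_boost(x, gain_db=12.0)
```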
For N, environmental noise input via the neckband microphone is analyzed, and the enhancement level of xLOUD is automatically controlled according to the noise level. This keeps voice audio and music easy to understand in places with a large amount of surrounding noise.
During analysis of the environmental noise, the microphone input signal may be mixed with signals other than regular environmental noise, such as feedback from the speakers, outdoor wind noise, and noise from the unit housing as the user moves around. In N, xLOUD uses environmental noise analysis technology that takes advantage of multiple microphone beamforming to automatically control the volume without being affected by such signals.
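The noise-to-enhancement mapping can be sketched as a level detector driving a gain curve. The RMS thresholds and dB range below are illustrative assumptions, not N's actual tuning.

```python
import numpy as np

def enhancement_level(noise_frame, min_db=0.0, max_db=12.0,
                      quiet_rms=0.01, loud_rms=0.3):
    """Map the RMS of the ambient-noise estimate (e.g. the beamformer's
    noise output) to an enhancement level in dB, ramping linearly
    between a quiet floor and a loud ceiling."""
    rms = np.sqrt(np.mean(noise_frame ** 2))
    t = np.clip((rms - quiet_rms) / (loud_rms - quiet_rms), 0.0, 1.0)
    return min_db + t * (max_db - min_db)

rng = np.random.default_rng(1)
quiet = 0.005 * rng.standard_normal(1024)   # e.g. a calm room
loud = 0.5 * rng.standard_normal(1024)      # e.g. a busy street
```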
Multiple microphone beamforming
Multiple microphone beamforming lets you control N by voice commands, even in noisy environments.
Beamforming is a powerful noise suppression technology to allow voice recognition in noisy environments, even with microphones located away from the mouth.
For N, four microphones are arranged around the neckband-style wearable device. Just as humans pick out particular sounds from surrounding noise by using the sound pressure difference and time difference of arrival between their left and right ears, N actively uses the subtle differences among its four microphones to better discriminate sound components coming from the user's mouth and to reduce other sources of noise. This allows voice control through microphones located away from the user's mouth.
In addition to outdoor environmental noise, there are other types, such as wind noise when riding a bicycle and noise caused by body movement while the device is worn around the neck. The particular shape of the N enclosure allows optimal positioning and orientation of the four microphones to effectively reduce the impact of these noise types.
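The principle can be illustrated with the classic delay-and-sum method: align each microphone's signal toward the target direction and average, so the target adds coherently while noise from other directions averages out. N's actual algorithm is more sophisticated; this is a minimal sketch with integer-sample delays and synthetic signals.

```python
import numpy as np

def delay_and_sum(mic_signals, delays):
    """Delay-and-sum beamforming: time-align every microphone toward the
    target (the user's mouth) and average. The target adds coherently;
    noise from other directions adds incoherently and is attenuated."""
    aligned = [np.roll(sig, -d) for sig, d in zip(mic_signals, delays)]
    return np.mean(aligned, axis=0)

rng = np.random.default_rng(0)
voice = np.sin(np.linspace(0.0, 20.0 * np.pi, 400))
delays = [0, 2, 4, 6]  # per-mic arrival delays from the mouth, in samples
mics = [np.roll(voice, d) + 0.5 * rng.standard_normal(400) for d in delays]
out = delay_and_sum(mics, delays)
noise_before = np.mean((mics[0] - voice) ** 2)
noise_after = np.mean((out - voice) ** 2)
```

With four microphones, the incoherent noise power in the output drops to roughly a quarter of a single microphone's, while the aligned voice passes through at full level.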
N's Voice Recognition gives you the option of easy hands-free control.
The device is designed to understand simple voice commands on its own, while natural language processing powered by cloud technologies handles more natural speech.
N’s Voice Recognition technology uses both an internal on-device voice recognition engine and a cloud-based one, taking advantage of each engine's different vocabulary size. Depending on the content of the voice command, the network connection status, and the usage environment, the system chooses between the internal and the cloud-based recognition results, providing appropriate feedback to the user.
For N, basic operations are handled by the device's internal voice recognition engine for quick response. For more natural speech, the system connects over the internet to a powerful cloud-based recognition engine. Basic voice commands such as "Next song" and "Previous song" are treated differently from complex natural speech such as "What is the weather like in San Francisco now?", giving the user highly flexible hands-free voice control.
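The local-versus-cloud arbitration described above might be sketched as follows. The command set, return format, and offline fallback are illustrative assumptions, not N's actual logic.

```python
LOCAL_COMMANDS = {"next song", "previous song", "play", "pause"}

def cloud_recognize(utterance: str) -> str:
    """Stand-in for a call to a large-vocabulary cloud engine."""
    return f"cloud:{utterance}"

def recognize(utterance: str, online: bool) -> str:
    """Answer basic commands from the small on-device grammar for a quick
    response; defer anything else to the cloud engine when connected."""
    phrase = utterance.strip().lower()
    if phrase in LOCAL_COMMANDS:
        return f"local:{phrase}"
    if online:
        return cloud_recognize(utterance)
    return "error:not-understood-offline"
```

The key design point is that the on-device path never waits on the network, so basic playback control stays responsive even with no connection at all.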
Auto tilt adjustment camera
N's Auto-Tilt Adjustment camera recognizes the inclination of the user's posture and keeps itself horizontal by rotating, automatically adjusting to take pictures close to the user's field of vision.
The auto tilt adjustment camera estimates the user's angle of lean and automatically adjusts the camera tilt to ensure that it is horizontal. This allows photography at an appropriate angle, similar to the user's line of sight.
For N, an ultra-compact camera with a single axis of rotation has been installed on the front of the neckband-style wearable device (hereinafter referred to as the "neckband"). The system estimates the user's angle of lean using a tiny sensor installed inside. It then calculates the desired angle of rotation and automatically rotates the camera so that it remains horizontal at all times. It automatically takes pictures according to the Context Recognition function, which can detect that the user is doing activities such as running or cycling. This provides the user with a handy tool for logging their daily activities.
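The re-leveling step reduces to estimating the lean angle from the gravity vector reported by the accelerometer and rotating the single-axis camera by the opposite angle. A minimal sketch, assuming a simple two-axis layout (x lateral, y vertical when the wearer is upright); the axis convention is an assumption for illustration:

```python
import math

def camera_correction_deg(accel_x, accel_y):
    """Estimate the user's lean from the gravity components reported by
    the accelerometer and return the single-axis rotation that
    re-levels the camera."""
    lean = math.degrees(math.atan2(accel_x, accel_y))
    return -lean  # rotate opposite to the lean

G = 9.81
upright = camera_correction_deg(0.0, G)          # standing straight
leaning = camera_correction_deg(G * math.sin(math.radians(30.0)),
                                G * math.cos(math.radians(30.0)))
```

When the wearer stands upright, gravity lies entirely on the y axis and no correction is applied; a 30-degree lean produces a 30-degree counter-rotation.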
To ensure privacy, the camera is stowed away when not in use. It is rotated and exposed only when actively recording images.
Spatial Acoustic Conductor
N's open-ear earphones have Spatial Acoustic Conductors, so external sounds remain audible. This means that you can enjoy listening to music while having a conversation with friends at the same time.
The Spatial Acoustic Conductor utilizes Sony's unique sound duct design to allow the user to listen to their desired voice audio or music while being able to hear ambient sounds.
The Spatial Acoustic Conductor has been specially designed so that it does not block the ear canal. This gives the user the ability to easily talk with people while simultaneously listening to music or voice audio. Sound generated by the driver unit that sits behind the ear is transmitted via the specially designed sound duct and directed out of the opening toward the ear canal.
In addition to securing the earphones to the ear, the guide ring also helps to direct sound from the sound duct toward the eardrum and ear canal. This results in a rich low-frequency sound. The guide ring has been specially designed so that it does not affect the user's ability to hear the surrounding environment.
The design is so lightweight that users sometimes forget they are even wearing earphones. Parts of the guide ring and sound duct that come into contact with the ear are made from flexible material that fits well to varying ear shapes. This design provides excellent comfort even after extended periods of use.
N's Context Recognition System analyzes data from its built-in accelerometer and GNSS receiver to detect your activities and location. For example, it can tell whether you're walking or riding a bike.
Context Recognition infers the user's current context from time-series data capturing the details of the user's activities over time. Applications can then be controlled smartly using the inferred context.
For N, an accelerometer and a GNSS receiver are installed inside the neckband-style wearable device (hereinafter referred to as the "neckband").
The accelerometer generates vibration data reflecting the user's movement. Deep learning, a state-of-the-art machine learning technique, processes this data and classifies the user's movement in near real time as walking, running, cycling, or standing still.
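For illustration, the classification step can be sketched with hand-crafted features, signal energy and the dominant frequency of an accelerometer window, in place of the actual deep-learning model. All thresholds below are invented for the example.

```python
import numpy as np

def classify_movement(accel_window, hz=50):
    """Classify one window of accelerometer magnitude samples using two
    hand-crafted features: signal energy and the dominant frequency of
    the body's periodic motion. Thresholds are invented for illustration."""
    a = accel_window - np.mean(accel_window)
    energy = np.mean(a ** 2)
    if energy < 0.05:
        return "still"
    spectrum = np.abs(np.fft.rfft(a))
    dominant_hz = (np.argmax(spectrum[1:]) + 1) * hz / len(accel_window)
    if dominant_hz > 2.5:
        return "running"  # fast step rate
    return "cycling" if energy >= 2.0 else "walking"

t = np.arange(100) / 50.0  # a 2-second window at 50 Hz
```

A real deep network learns far richer features than these two, but the pipeline shape is the same: windowed sensor data in, one activity label out, many times per minute.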
The GNSS receiver produces the user's location data, which is clustered into representative places according to its spatio-temporal density. A machine learning algorithm then extracts time-related feature vectors for each place and classifies it as home, workplace, and so on.
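The place-labeling step can be sketched with a grid-based stand-in for density clustering and a simple time-of-day rule. The cell size, hour ranges, labels, and coordinates below are illustrative assumptions, not the actual algorithm or data.

```python
from collections import Counter

def representative_places(fixes, cell=0.001):
    """Snap GNSS fixes (lat, lon) to a coarse grid and count dwell per
    cell -- a simple stand-in for spatio-temporal density clustering."""
    return Counter((round(lat / cell), round(lon / cell)) for lat, lon in fixes)

def label_place(visit_hours):
    """Label a place from the hours of day it is visited: mostly night
    hours suggest home, mostly working hours suggest a workplace."""
    night = sum(1 for h in visit_hours if h < 7 or h >= 21)
    work = sum(1 for h in visit_hours if 9 <= h < 18)
    if night > work:
        return "home"
    if work > night:
        return "workplace"
    return "other"

# Two fixes fall in the same grid cell (one place); the third is elsewhere.
places = representative_places([(35.6301, 139.7401), (35.6302, 139.7403),
                                (35.6584, 139.7451)])
```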
The N integrates these processes as its Context Recognition technology and provides timely information to the user. For instance, it can notify the user of the weather forecast near their office when they are about to leave for work, or deliver news while the user is cycling and stopped at a traffic signal.
Context Recognition combines sensing and machine learning techniques. Such technology generally increases power consumption and requires a large amount of memory and computational power, which makes it difficult to implement on a wearable device. However, we were able to overcome these difficulties by developing a low-power sensing processor and optimally offloading Context Recognition from the main CPU.