car. The purpose of this work is to explore how users perceive machine-synthesized
reading. Are users willing to tolerate the deficiencies of machine-synthesized
speech in exchange for more current content? How does listening to it affect
driver distraction? How do the answers to these questions differ for various
types of text content? These are the questions we try to answer in the
presented study. We conducted the study with 12 participants,
each facing three types of tasks. The tasks differed in the length and structure of
the presented text. Reading out a fable represented unstructured text read for
pleasure. The news represented shorter, more structured texts. Browsing a car
manual was an example of working with structured text where the user looks
for particular information without paying much attention to the surrounding
content. The results indicate relatively good user acceptance for the presented
tasks. Driver distraction was related to the amount of interaction with the
system. Users opted to control the system with buttons on the steering wheel
and made little use of the system’s display.
Keywords: Architectures for interaction, CUI, SUI and GUI, HCI methods and
theories, Interaction design, Speech and natural language interfaces, Long text
reading, car, UI, LCT.
1 Introduction
Drivers are well accustomed to listening to radio, music or audio books. The quality
of machine-synthesized speech is, however, still inferior to that of a professional
speaker reading out a text tailored for audio presentation. On the other hand,
creating such professionally narrated content is much slower, less flexible and
more expensive.
The purpose of the study presented in this text is to learn to what extent the user is
willing to cope with the deficiencies of text-to-speech synthesis (TTS).
Text processing is one of the activities humans do frequently. It ranges from passive
reading to text creation, error correction and team collaboration. Users tend to
shift most of the activities they previously conducted on the desktop to the mobile
environment.
They even want to perform certain tasks in a car while driving. User interfaces for
mobile devices, however, have to account for a smaller form factor, less efficient
input methods and the distraction caused by using the system in a car. We addressed
the tasks of text creation and correction in our previous work [2]. In this paper we
focus on the apparently less difficult but still important task of text reading.
2 Related Work
Significant attention was devoted in the past to assessing the impact of various in-car
activities [1]. The Lane Change Test (LCT) [9] and subjective tests using questionnaires such as NASA TLX [5] and DALI [6], [7] are examples of popular methods
used to assess the impact of various secondary in-car tasks on the primary task of
driving.
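As a rough illustration of how an LCT-style driving metric can be summarized, the sketch below computes a mean lateral deviation between a driven path and a reference path. It assumes that both paths are sampled at the same equally spaced points along the track and are given as lateral positions in metres; the function name and the sample data are hypothetical and are not taken from the study or from the LCT specification.

```python
def mean_lane_deviation(driven, reference):
    """Mean absolute lateral deviation (in metres) between two paths.

    Assumption: both lists hold lateral positions sampled at the same
    equally spaced points along the track.
    """
    if not driven or len(driven) != len(reference):
        raise ValueError("paths must be non-empty and of equal length")
    total = sum(abs(d - r) for d, r in zip(driven, reference))
    return total / len(driven)


if __name__ == "__main__":
    # Hypothetical samples: the driver drifts before completing a lane change.
    reference_path = [0.0, 0.0, 3.5, 3.5]
    driven_path = [0.0, 0.4, 3.1, 3.5]
    print(mean_lane_deviation(driven_path, reference_path))
```

A larger value indicates that the driver followed the reference path less closely, which is commonly read as higher distraction from the secondary task.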
Although electronic systems are increasingly abundant in cars, which rightfully
causes concern about their impact on driving, communication between the driver and
passengers is also frequent and can hardly be regulated [4]. The negative impact of
holding a conversation while driving on driving performance was assessed in
various studies [15], [16].
Several approaches to designing speech-based UIs for in-car usage, including
menu-based and search-based UIs, have been described [8], [10].
The general quality of TTS systems can be measured effectively only on the
basis of reliable and valid listening tests, e.g. using a mean opinion score (MOS)
scale [18]. TTS quality was also assessed in terms of its suitability for various
tasks such as computer-assisted learning of foreign languages [17]. In this study we
try to show that the quality of today’s state-of-the-art TTS systems is sufficient
for reading out texts in a car.
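The listening-test scoring mentioned above can be sketched as follows. This is a minimal example, assuming that each listener rates a speech sample on a 5-point scale (1 = bad, 5 = excellent) and that the mean opinion score is the arithmetic mean of these ratings. The function names, the normal-approximation confidence interval and the ratings are illustrative assumptions, not the procedure or data of the cited work.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Arithmetic mean of listener ratings on an assumed 1-5 scale."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(not 1 <= r <= 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return mean(ratings)


def mos_with_interval(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    Assumption: a normal approximation is used; with very few listeners
    a t-distribution would be more appropriate.
    """
    mos = mean_opinion_score(ratings)
    if len(ratings) < 2:
        return mos, 0.0
    half_width = z * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


if __name__ == "__main__":
    hypothetical_ratings = [4, 4, 3, 5, 4]  # made-up example ratings
    print(mos_with_interval(hypothetical_ratings))
```

Reporting the interval alongside the mean helps to judge whether a difference between two TTS systems is meaningful for a small listener panel.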
3 Research Goals and Experiment Design
The purpose of this study is to analyze the usability and distraction aspects of text
reading in a car in general. The research questions we seek to answer fall into
three categories: usability, distraction and performance.
• Usability: Is the TTS quality sufficient for this kind of task? What part of the implemented functionality is actually used by the user? What are the preferred control
mechanisms (buttons vs. swipe gestures, audio vs. visual feedback)? What are the
preferred usage patterns (auto-playback vs. manual browsing through the text)? Is
there a correlation between the results and personal information about the subjects?
• Distraction: What levels of distraction can we observe for each of the tasks? How
is distraction perceived subjectively? How often and for how long do the users look
at the screen?
• Performance: Does the user remember what has been read?
We decided to carry out tests using three scenarios: ‘Fable’, ‘News’ and ‘Car Manual’. They differ in the complexity of information, in the structure of the presented
text and in the ways the user is allowed to inter