Background
Automotive Grade Linux (AGL) is a collaborative open source project that is bringing together automakers, suppliers and
technology companies to accelerate the development and adoption of a fully open software stack for the connected car.
As a member of the speech expert group, the Amazon (Alexa Automotive) team intends to collaborate on defining the voice
service APIs for the AGL platform.
...
- I should be able to follow the guidelines of the AGL speech framework to plug in my voice agent.
- I should be able to follow the guidelines of the AGL speech framework to plug in my wake word solution.
- I should be able to follow the guidelines of the AGL speech framework to plug in my NLU engine.
High Level Architecture
Quoting the AGL documentation
(http://docs.automotivelinux.org/docs/apis_services/en/dev/reference/signaling/architecture.html#architecture):
“Good practice is often based on modularity with clearly separated components assembled within a common framework. Such modularity ensures separation of duties, robustness, resilience, achievable long term maintenance and security.”
High Level Components
Architecture
The Voice Services architecture in AGL has two layers: the high-level voice service layer and the vendor software layer. In the above architecture, the high-level voice service is composed of multiple binding APIs (colored in green) that abstract the functioning of all the voice assistants running on the system. The vendor software layer consists of vendor-specific voice agent implementations that comply with the Voice Agent Binding APIs.
...
<Yet to be standardized>
}
}
3) Configuration
- Provides a mechanism for OEMs to configure its functionality. OEMs should be able to configure:
- List of active agents
- Roles and responsibilities of each agent
- Language setting
- Default Agent
- Enable/Disable Fallback Invocation mode
- Enable/Disable Agent Switching during multi turn dialog
- ... more
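The configuration items listed above could be modeled roughly as follows. This is only a sketch under stated assumptions: the class `VoiceServiceConfig`, every field name, the agent ids and the role names are hypothetical, since the source does not define a configuration schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VoiceServiceConfig:
    """Hypothetical OEM configuration; all names here are illustrative."""

    active_agents: List[str]
    default_agent: str
    language: str = "en-US"
    # Maps an agent id to the roles assigned to it (role names are made up).
    agent_roles: Dict[str, List[str]] = field(default_factory=dict)
    fallback_invocation_enabled: bool = False
    agent_switching_in_dialog_enabled: bool = False

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the config is consistent."""
        problems = []
        if self.default_agent not in self.active_agents:
            problems.append("default agent is not in the list of active agents")
        for agent_id in self.agent_roles:
            if agent_id not in self.active_agents:
                problems.append(f"roles assigned to inactive agent {agent_id!r}")
        return problems


config = VoiceServiceConfig(
    active_agents=["agent-a", "agent-b"],
    default_agent="agent-a",
    agent_roles={"agent-a": ["navigation"], "agent-b": ["media"]},
    fallback_invocation_enabled=True,
)
print(config.validate())
```

Keeping the consistency checks in one `validate` method means an inconsistent OEM configuration can be rejected at startup rather than during a dialog.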
...
vshl/setActive
Activate or deactivate a voice agent.
"permission": "urn:AGL:permission:speech:public:accesscontrol"
Request:
{
  "agent_id": "integer",
  "is_active": "boolean"
}
Responses:
{
  "jtype": "afb-reply",
  "request": {
    "status": "string" // success or bad-state or bad-request
  }
}
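To illustrate the request and reply shapes of vshl/setActive, here is a minimal handler sketch. It is not the binding's implementation: the function `set_active`, the in-memory agent registry, and the rules for when each status is returned are assumptions, because the source lists the status values without defining when they apply.

```python
from typing import Any, Dict


def _reply(status: str) -> Dict[str, Any]:
    """Wrap a status string in the reply envelope shown in the spec."""
    return {"jtype": "afb-reply", "request": {"status": status}}


def set_active(payload: Dict[str, Any], agents: Dict[int, bool]) -> Dict[str, Any]:
    """Hypothetical handler: `agents` maps agent_id to its is_active flag."""
    agent_id = payload.get("agent_id")
    is_active = payload.get("is_active")
    # bool is a subclass of int in Python, so reject it explicitly for agent_id.
    if not isinstance(agent_id, int) or isinstance(agent_id, bool):
        return _reply("bad-request")
    if not isinstance(is_active, bool):
        return _reply("bad-request")
    # Assumption: an unknown agent id is reported as bad-state.
    if agent_id not in agents:
        return _reply("bad-state")
    agents[agent_id] = is_active
    return _reply("success")


registry = {1: True, 2: False}
print(set_active({"agent_id": 2, "is_active": True}, registry))
```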
...
afb-voiceservice-wakeword-detector
...
Provides an interface primarily for the core afb-voiceservice-highlevel to listen for wakeword detection events and make request routing decisions.
- This binding will internally talk to, or host, vendor-specific wake word solutions to enable wake word detection.
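The event-driven relationship described above, where the high-level service subscribes to detection events and then decides which agent receives the request, might look like the sketch below. The classes `WakewordDetector` and `RequestRouter`, their methods, and the wake word and agent names are hypothetical and not part of any AGL API.

```python
from typing import Callable, Dict, List, Optional


class WakewordDetector:
    """Hypothetical stand-in for the wake word binding's event interface."""

    def __init__(self) -> None:
        self._listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        self._listeners.append(listener)

    def emit_detected(self, wakeword: str) -> None:
        # A real binding would trigger this from a vendor wake word engine.
        for listener in self._listeners:
            listener(wakeword)


class RequestRouter:
    """Hypothetical routing logic for the high-level voice service."""

    def __init__(self, wakeword_to_agent: Dict[str, str]) -> None:
        self._wakeword_to_agent = wakeword_to_agent
        self.selected_agent: Optional[str] = None

    def on_wakeword(self, wakeword: str) -> None:
        # Unknown wake words select no agent (None).
        self.selected_agent = self._wakeword_to_agent.get(wakeword.lower())


detector = WakewordDetector()
router = RequestRouter({"hey-agent-a": "agent-a"})
detector.subscribe(router.on_wakeword)
detector.emit_detected("Hey-Agent-A")
print(router.selected_agent)
```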
Voice Agent Vendor Software
1) voice-agent-binding
- The API specification of the voice agent is defined in this document. All vendor-specific voice agent bindings will follow the same specification to integrate with the high level voice service.
- Voice Agent will listen to audio input when instructed by the high level voice service.
- Voice Agent will run its own automatic speech recognition and natural language processing, and will generate intents to perform the requested action.
- Voice Agent will have its own authentication, connection and dialog management flows, and will generate events to notify the high level voice service of its state transitions.
- Voice Agent will use the high level voice service's interaction manager to command system applications to perform tasks such as routing to a specific geocode, dialing a number, or playing music.
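The state-transition events mentioned above can be sketched as a small state machine that notifies a callback on every change. The state names, the `VoiceAgent` class and its methods are assumptions for illustration; the source does not enumerate the agent's states or event names.

```python
from enum import Enum
from typing import Callable, List


class DialogState(Enum):
    """Hypothetical dialog states; not taken from the specification."""

    IDLE = "idle"
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"


class VoiceAgent:
    def __init__(self, notify: Callable[[DialogState], None]) -> None:
        # `notify` stands in for the event sent to the high level voice service.
        self._notify = notify
        self.state = DialogState.IDLE

    def _transition(self, new_state: DialogState) -> None:
        # Only emit an event when the state actually changes.
        if new_state is not self.state:
            self.state = new_state
            self._notify(new_state)

    def start_listening(self) -> None:
        """Called when the high level voice service instructs the agent to listen."""
        self._transition(DialogState.LISTENING)

    def end_of_speech(self) -> None:
        self._transition(DialogState.THINKING)

    def respond(self) -> None:
        self._transition(DialogState.SPEAKING)

    def finish(self) -> None:
        self._transition(DialogState.IDLE)


events: List[DialogState] = []
agent = VoiceAgent(events.append)
agent.start_listening()
agent.end_of_speech()
agent.respond()
agent.finish()
print([e.value for e in events])
```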
...