AlexK
New member
Status: New idea

MCP servers are getting more popular. However, they lack the ability to interact with what is shown on screen.

I suggest creating a dedicated API that describes all the features available on the current screen and how to call them.

For example, suppose I am watching a YouTube video.
The current activity shows the content and declares to an LLM what it can do, in a format like:
[{
  "description": "seek the video by an integer number of seconds, where positive means forward and negative means backward",
  "callback": callback,
},
{
  "description": "stop video", ...
},
{
  "description": "find video by search term" ...
}...]

Using this interface, anyone could create an app that can be used hands-free, via voice commands or an autonomous assistant.
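
To make the idea more concrete, here is a rough sketch of how a page might register such actions and how an assistant might call them. Everything in it is an assumption for illustration: the `ScreenAction` shape, the `name` field, the `FakePlayer` stand-in, and the `describeActions` and `invoke` helpers are hypothetical names, not part of any existing browser or MCP API.

```typescript
// Hypothetical shape of one action a page exposes (illustrative names only).
interface ScreenAction {
  name: string;
  description: string;
  callback: (arg?: number | string) => string;
}

// Toy stand-in for a video player so the sketch runs without a browser.
class FakePlayer {
  position = 60; // current position in seconds
  playing = true;
  seek(deltaSeconds: number): void {
    // Clamp at zero so a large negative seek cannot go before the start.
    this.position = Math.max(0, this.position + deltaSeconds);
  }
  stop(): void {
    this.playing = false;
  }
}

// The current screen declares what it can do.
function buildActions(player: FakePlayer): ScreenAction[] {
  return [
    {
      name: "seek",
      description:
        "seek the video by an integer number of seconds, where positive means forward and negative means backward",
      callback: (arg) => {
        const delta = Number(arg);
        // Reject missing, non-numeric, or fractional arguments.
        if (!isFinite(delta) || Math.floor(delta) !== delta) {
          return "error: expected an integer number of seconds";
        }
        player.seek(delta);
        return `position is now ${player.position}s`;
      },
    },
    {
      name: "stop",
      description: "stop video",
      callback: () => {
        player.stop();
        return "stopped";
      },
    },
  ];
}

// What the assistant sees: names and descriptions only, never the callbacks.
function describeActions(actions: ScreenAction[]): string {
  return JSON.stringify(
    actions.map((a) => ({ name: a.name, description: a.description }))
  );
}

// Dispatch an action the assistant picked by name.
function invoke(actions: ScreenAction[], name: string, arg?: number | string): string {
  const match = actions.filter((a) => a.name === name)[0];
  return match ? match.callback(arg) : `error: unknown action "${name}"`;
}

const player = new FakePlayer();
const actions = buildActions(player);
console.log(describeActions(actions));
console.log(invoke(actions, "seek", -15));
console.log(invoke(actions, "stop"));
```

In this sketch the assistant only ever receives the text descriptions and sends back an action name with an optional argument, so the callbacks stay inside the page. A voice front end could use the same list by matching a spoken command to one of the descriptions.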

3 Comments
Status changed to: New idea
Jon
Community Manager

Thanks for submitting an idea to the Mozilla Connect community! Your idea is now open to votes (aka kudos) and comments.

AlexNahas
New member

This is the thought behind WebMCP.

Come check it out. It’s all open source, and contributors are welcome.

AlexNahas
New member

Come check out WebMCP! https://github.com/MiguelsPizza/WebMCP 

It’s exactly as you describe.