MCP servers are becoming more popular. However, they lack any means of visual interaction. I suggest creating a dedicated API that describes all available screen features and the ways to call them. For example, say I am watching a YouTube video. Current activit...
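
To make the suggestion concrete, here is a minimal sketch of what such a screen-description API might look like. Every name in it (`ScreenFeature`, `ScreenRegistry`, `describe`, `call`, and the `pause`/`seek` features of an imaginary video-player screen) is hypothetical and is not part of MCP or any existing SDK. The idea is only that the current screen registers the features it exposes, and a client can list them and call one by name.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ScreenFeature:
    """One action the current screen exposes (hypothetical shape)."""
    name: str
    description: str
    params: Dict[str, str]  # parameter name -> human-readable type
    handler: Callable[..., Any]


@dataclass
class ScreenRegistry:
    """Holds the features of the screen that is currently visible."""
    screen: str
    features: Dict[str, ScreenFeature] = field(default_factory=dict)

    def register(self, feature: ScreenFeature) -> None:
        self.features[feature.name] = feature

    def describe(self) -> List[Dict[str, Any]]:
        """Return a machine-readable list of what can be called right now."""
        return [
            {"name": f.name, "description": f.description, "params": f.params}
            for f in self.features.values()
        ]

    def call(self, name: str, **kwargs: Any) -> Any:
        """Invoke a feature by name; unknown names raise KeyError."""
        if name not in self.features:
            raise KeyError(f"unknown feature: {name}")
        return self.features[name].handler(**kwargs)


# Illustrative state for an imaginary video-player screen (an assumption).
state = {"playing": True, "position": 0}


def pause() -> dict:
    state["playing"] = False
    return dict(state)


def seek(seconds: int) -> dict:
    state["position"] = seconds
    return dict(state)


registry = ScreenRegistry(screen="video_player")
registry.register(ScreenFeature("pause", "Pause playback", {}, pause))
registry.register(
    ScreenFeature("seek", "Jump to a position", {"seconds": "int"}, seek)
)
```

A client would first call `registry.describe()` to learn what the screen offers, then something like `registry.call("seek", seconds=42)` to act on it. Keeping the description separate from the handlers means the same listing could be serialized and sent to a model without exposing any implementation details.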