What we're interested in is the analysis of music video (or opera) structure. The primal steps we might adopt is as follows:
- Audio
- Separate the non-music (speech, silence) and music (pure-music, vocal) parts.
- Extract the vocal in the music parts got from step 1.
- Identifying roles by voice.
- Video
- Face (role) detection
- Identifying roles by face (costume) recognition
After getting the clues from audio and video, we would try to analysize the social relationship between the roles
- Social Relationship
- Who is the leading role
- How much relevence is between any pair of two roles
If those goals above can be achieved succesfully, some application can be done:
- Application
- Give users only the fragments of those actors/actresses he/she is interested in
- The graphics of roles relationship
- A simple script. I mean the program can automatically make marks for different roles, this make the audience easier to understand the story.
沒有留言:
張貼留言