Dang, swearing was one of my strategies to get the bot to forward me to a representative
🤖 I’m a bot that provides automatic summaries for articles:
According to a report from the Japanese news site The Asahi Shimbun, SoftBank’s project relies on an AI model to alter the tone and pitch of a customer’s voice in real-time during a phone call.
SoftBank’s developers, led by employee Toshiyuki Nakatani, trained the system using a dataset of over 10,000 voice samples, which were performed by 10 Japanese actors expressing more than 100 phrases with various emotions, including yelling and accusatory tones.
By analyzing the voice samples, SoftBank’s AI model has reportedly learned to recognize and modify the vocal characteristics associated with anger and hostility.
In a Reddit thread on SoftBank’s AI plans, call center operators from other regions related many stories about the stress of dealing with customer harassment.
Harassment of call center workers is a very real problem, but given the introduction of AI as a possible solution, some people wonder whether it’s a good idea to essentially filter emotional reality on demand through voice synthesis.
By reducing the psychological burden on call center operators, SoftBank says it hopes to create a safer work environment that enables employees to provide even better services to customers.
Saved 78% of original text.
Interacting with people whose tone doesn’t match their words may induce anxiety as well.
Have they actually proven this is a good idea, or is this a “so preoccupied with whether or not they could” scenario?
Am I crazy or is 10,000 samples nowhere near enough for training people’s voices?
I don’t think it seems like too few samples for it to work.
What they train for is rather specific: identifying anger and hostility characteristics, and adjusting pitch and inflection.
Dunno if you meant it like that when you said “training people’s voices”, but they’re not replicating voices or interpreting meaning.
“…learned to recognize and modify the vocal characteristics associated with anger and hostility. When a customer speaks to a call center operator, the model processes the incoming audio and adjusts the pitch and inflection of the customer’s voice to make it sound calmer and less threatening.”
This is giving me Black Mirror vibes. Like when that lady’s consciousness got put into a teddy bear, and she only had two ways to express herself:
- Monkey wants a hug
- Monkey loves you
I get that you shouldn’t go off on customer service reps (the reason you’re angry is never their fault), but filtering out the emotion/intonation in your voice is a bridge too far.
Most of the time angry customers don’t even understand what they’re angry at. They’ll 180 in a heartbeat if the agent can identify the actual issue. I agree, this is unnecessary.
Based on my experience working in a call center, I wouldn’t call it unnecessary. People are fucked up.
I did phones in a different century, so I don’t know whether this would fly today. But, my go-to for someone like this was “ok, I think I see the problem here. Shall we go ahead and fix it or do you need to do more yelling first?”
I can’t remember that line ever not shutting them down instantly. I never took it personally; whatever they had going on, they were never angry at me.
Then again, I do remember firing a couple of customers (“we don’t want your business any more etc”) after I later became a manager and people were abusive to staff. So you could be right, also.