My relationship with Taylor Swift is complicated: I don't hate her
— in fact, she seems like a very nice person. But I definitely hate her
songs: her public persona always comes across to me as entitled, abusive, and/or
unpleasant overall. But what if she didn't have to be? What if we could
take her songs and make them more polite? What would that be like?
In today's post we will use the power of science to answer this question.
In particular, the power of Natural Language Processing (NLP) and word embeddings.
The first step is deciding on a way to model songs. We will reach into our
NLP toolbox and take out distributional
semantics, a research area that
investigates whether words that show up in similar contexts also have similar
meanings. This research introduced the idea that once you treat a word like a
number (a vector, to be precise, called the embedding of the word), you
can apply regular math operations to it and obtain results that make sense.
The classical example is a result shown in
this paper, where
Mikolov and his team managed to represent words in such a way that the
result of the operation
King - man + woman ended up being very close to the vector for Queen.
The picture below shows an example. If we apply this technique to all the
Sherlock Holmes novels, we can see that the names of the main characters are
placed in a way that intuitively makes sense if you also plot the locations for
"good", "neutral", and "evil" as I've done.
Mycroft, Sherlock Holmes' brother,
barely cares about anything and therefore is neutral; Sherlock, on the other hand,
is much "gooder" than his brother. Watson and his wife Mary are the least
morally-corrupt characters, while the criminals end up together in their own
corner. "Holmes" is an interesting case: the few sentences where people
refer to the detective by saying just "Sherlock" are friendly scenes, while the
scenes where they call him "Mr. Holmes" are usually tense, serious, or may even
refer to his brother. As a result, the word "Sherlock" ends up with a positive
connotation that "Holmes" doesn't have.
This technique is implemented by
word2vec, a series of
models that receive documents as input and turn their words into vectors.
For this project, I've chosen the
gensim Python library. This
library not only implements
word2vec but also
doc2vec, a model that will do all the heavy lifting for us when it
comes to turning a list of words into a song.
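To make this concrete, here is a minimal sketch of how both models look in gensim (assuming gensim 4.x; the tiny "lyrics" corpus and the hyperparameters below are placeholders, not the actual data or settings of this project):

```python
# A minimal sketch: train word and document vectors on a tiny, made-up
# corpus of "song lyrics". Everything here is illustrative.
from gensim.models import Word2Vec
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Hypothetical corpus: each song is just a list of lowercase tokens.
songs = {
    "song_a": "we are never ever getting back together".split(),
    "song_b": "i knew you were trouble when you walked in".split(),
    "song_c": "shake it off shake it off".split(),
}

# word2vec: every word becomes a vector (its "embedding").
w2v = Word2Vec(
    sentences=list(songs.values()),
    vector_size=100,   # dimensionality of the word vectors
    window=5,          # context window size
    min_count=1,       # keep every word; the corpus is tiny
    epochs=50,
)

# With a large enough corpus, vector arithmetic starts making sense,
# e.g. the classic King - man + woman ≈ Queen example:
# w2v.wv.most_similar(positive=["king", "woman"], negative=["man"])

# doc2vec: every *song* (document) gets a vector of its own too.
tagged = [TaggedDocument(words=w, tags=[title]) for title, w in songs.items()]
d2v = Doc2Vec(tagged, vector_size=50, min_count=1, epochs=50)

# Vector for a known song, and for a brand new list of words.
print(d2v.dv["song_a"][:5])
print(d2v.infer_vector("this is a brand new song".split())[:5])
```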
This article is the fifth of a series in which I explain what my
research is about in (I hope) a simple and straightforward manner. For
more details, feel free to check the Research section.
In my last post
we faced a hard problem: if a person visits a museum, for
instance, we would like to give them information about the piece they are looking at.
But computers don't have eyes! We could use a camera, sure,
but that only works if there is only one art piece nearby. If there are several
paintings close to each other, how do we decide which one of them
is the interesting one?
One way is through what we call eye-tracking. This technology works
like a regular camera, but with a catch: it doesn't only look forward;
it also looks backwards, at you! If you wear one of these so-called
eye-trackers, it follows the movement of your eyes and
records not only the entire scene (like a regular camera) but also a tiny
dot that points out what you were looking at.
Some colleagues and I found that eye-movement gives you a very good
guess at what has captured someone's attention. After all, if you are interested
in something, you are probably looking at it.
But there's a complication: eye-trackers are bulky, expensive, and take a
long time to set up. And most people feel uncomfortable
knowing that someone is recording their activity all the time.
It is safe to say that we won't be wearing eye-trackers for fun anytime soon,
and that's not great: what good are our results, if no one wants to use them?
Luckily, a man named John Kelleher came up with a smart idea: whenever
we are interested in an object, we look at it and get closer.
He then applied this idea backwards: if we are looking in a
certain direction and walking towards it, all we need to do is figure out what
is right in front of us - that must be the object we care about.
This technique is called visual salience, and it's a good alternative
to an eye-tracker: rather than making people wear expensive glasses, all we
need to know is the direction in which they are walking. It might not be as
effective, but it's good enough for us.
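Here is a minimal geometric sketch of that idea — not Kelleher's actual model, just an illustration under the assumption that we know the visitor's position, their walking direction, and the positions of the candidate objects:

```python
# Pick the object that lies most directly "in front" of the visitor,
# i.e. the one whose direction best matches the walking direction.
import math

def most_salient(position, heading, objects):
    """position: (x, y) of the visitor
    heading:  (dx, dy) walking direction
    objects:  {name: (x, y)} positions of candidate objects"""
    best_name, best_cos = None, -2.0
    for name, (ox, oy) in objects.items():
        vx, vy = ox - position[0], oy - position[1]
        norm = math.hypot(vx, vy) * math.hypot(*heading)
        if norm == 0:
            continue
        # Cosine of the angle between "where I'm walking" and "where the object is".
        cos = (vx * heading[0] + vy * heading[1]) / norm
        if cos > best_cos:
            best_name, best_cos = name, cos
    return best_name

# Visitor walking roughly towards the right wall:
print(most_salient((0, 0), (1, 0.1), {"Mona Lisa": (5, 1), "Fire alarm": (0, 5)}))
```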
Following people's attention is important if we want our computers to cooperate
with us: if a computer asks you to turn on the lights, but you start walking
towards the fire alarm, it should warn you immediately that you are about to
make a mistake. How to correct that mistake, however, is the topic of the
next (and final) article.
So, you have successfully created an online community. People seem
genuinely engaged, and you have interesting discussions going on. And then
one day I show up, decide that "it would be a shame if something were to
happen to your little community", and start harassing your users because...
well, because. Call it 4chan, Gamergate, MRA or trolls, there's always a
group ready to run a community into the ground.
Like I said last time, one of the main characteristics
of the internet is that you can't block me, you can only
block my user. So let's go over, from the simplest to the most
complex, the ways you could keep me from annoying and/or harassing
other people in your community.
The first step I suggest you take is to establish a hierarchical scale of users. It
doesn't have to be too complex - I'd start with something like this (a rough code sketch follows the list):
Anonymous users are those that have not yet logged in.
Usually they are allowed read-only access to the site, but in some
cases not even that. As a counter-example, Slashdot
is known for allowing anonymous users to post and comment on the site,
although with a catch that I'll discuss later.
New users should have limited posting capabilities
- maybe they can only vote but not comment, or their comments are
given partial visibility by default. Getting out of this category should be
relatively easy for a "good" user (although time-consuming - no less than
an hour, perhaps even days), but it should definitely annoy those that
are only "giving the website a try".
Your regular users are the ones that actually use
your site as intended. They can post and comment at will. And finally,
the power users are allowed some extra
permissions - usually this means they can edit or remove other people's
posts. This level should be pretty hard to achieve.
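To make the hierarchy concrete, here is a rough sketch in Python; the level names and the permission table are illustrative, not a prescription:

```python
# A rough sketch of the user hierarchy described above.
from enum import IntEnum

class UserLevel(IntEnum):
    ANONYMOUS = 0   # read-only access (or not even that)
    NEW = 1         # limited posting, comments with reduced visibility
    REGULAR = 2     # can post and comment at will
    POWER = 3       # can also edit/remove other people's posts

# Minimum level required for each action.
PERMISSIONS = {
    "read":     UserLevel.ANONYMOUS,
    "vote":     UserLevel.NEW,
    "comment":  UserLevel.REGULAR,
    "post":     UserLevel.REGULAR,
    "moderate": UserLevel.POWER,
}

def can(user_level: UserLevel, action: str) -> bool:
    return user_level >= PERMISSIONS[action]

assert can(UserLevel.NEW, "vote") and not can(UserLevel.NEW, "comment")
```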
The iron fist of justice
Now that you have user levels, new users are your main concern: it is
not unusual for trolls to create thousands of accounts (automatically,
of course) and use them to assault a particular user. Remember: any regular user
should be able to stop the noise in a simple and straightforward way -
otherwise you risk becoming an online harassment platform, and you'll have
to publicly apologize like Twitter's CEO often does.
Our first moderation tool will be karma points. Each time a user
contributes to our website, other users can rate this contribution positively
or negatively. Contributions with "high karma" will be given a
prominent position, while contributions with "low karma" will be buried.
This is how Slashdot can allow anonymous contributions without drowning
in dumb comments: every comment posted anonymously will have very
low karma by default, but if enough users vote it up, it will eventually
be seen by everyone else. Similarly, Hacker News
will not allow users to vote negatively if they haven't yet reached a
certain karma threshold.
Sidenote: you don't want to rank your posts/comments simply by raw
vote count. Instead, take a look at
reddit's comment sorting system.
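For reference, the idea behind reddit's "best" comment sorting is to rank by the lower bound of the Wilson score confidence interval rather than by raw score, so that comments with few votes don't outrank well-tested ones. A minimal sketch (using the usual z = 1.96 for 95% confidence):

```python
# Rank comments by the lower bound of the Wilson score interval.
import math

def wilson_lower_bound(upvotes: int, downvotes: int, z: float = 1.96) -> float:
    n = upvotes + downvotes
    if n == 0:
        return 0.0
    p = upvotes / n  # observed fraction of positive votes
    return (p + z * z / (2 * n)
            - z * math.sqrt((p * (1 - p) + z * z / (4 * n)) / n)) / (1 + z * z / n)

# 60 up / 40 down with many votes beats a lone 1 up / 0 down comment:
print(wilson_lower_bound(60, 40))  # ~0.50
print(wilson_lower_bound(1, 0))    # ~0.21
```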
Another tool you'll find useful is the good old ban. A temporary ban
means that a given user cannot post for a given period of time, while
a permaban (permanent ban) means that the user is kicked out
forever. This is a standard tool in every forum, but we can still do
better: given that nothing stops a banned user from creating
a new account and continuing their toxic behavior (and remember, now they
are pissed about being banned), you can use a hellban.
When a user is hellbanned, no one but them can see their activity. The
user can still log in, comment and post, but this activity is invisible
to everyone else. From their point of view, it looks as if no one cares
about them anymore, and it's not unusual for them to just leave.
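A minimal sketch of the mechanism (the data structures are made up for illustration):

```python
# Posts from a hellbanned user are visible only to that user themselves.
hellbanned = {"troll42"}

posts = [
    {"author": "alice",   "text": "Nice discussion!"},
    {"author": "troll42", "text": "You are all idiots."},
]

def visible_posts(viewer: str):
    return [p for p in posts
            if p["author"] not in hellbanned or p["author"] == viewer]

print([p["text"] for p in visible_posts("alice")])    # troll42's post is hidden
print([p["text"] for p in visible_posts("troll42")])  # but they still see it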
Finally, you might also want to consider a "report" button, through
which users can report unruly behavior. This should be more or less
automated, but you cannot blindly trust these reports: you risk trolls
banding together and reporting users at will. To prevent this, an automated
recourse method should be enough - a moderator is notified, and the user
is not fully banned until a final decision is reached. If
you want to go the extra mile, you could also have a "protected" flag that
keeps certain users from being reported.
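A sketch of that flow could look like this; the threshold, names, and callback are all made up for illustration:

```python
# Reports hide content pending review; they never ban outright,
# and "protected" users can't be mass-reported at all.
REPORT_THRESHOLD = 5

def handle_report(post, reporter, protected_users, notify_moderator):
    if post["author"] in protected_users:
        return  # reports against protected users are ignored
    post.setdefault("reports", set()).add(reporter)  # one report per user
    if len(post["reports"]) >= REPORT_THRESHOLD and not post.get("hidden"):
        post["hidden"] = True      # hidden, *not* banned
        notify_moderator(post)     # a human makes the final call

post = {"author": "bob", "text": "spam spam spam"}
for i in range(5):
    handle_report(post, f"user{i}", protected_users=set(),
                  notify_moderator=lambda p: print("moderator notified:", p["text"]))
```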
That's about all you can do at this level. There are no new ideas here,
which is good - now you know that these concepts have been tried and
tested before. In the next two posts I'll be discussing
things that might not make as much sense, so stay tuned.
Once upon a time, you would create an e-mail account and use it for a
long time without receiving spam. In fact, whenever you received your
first spam message, you'd know exactly who to blame: that one cousin of
yours who'd send you every single motivational powerpoint she came
across, along with a list of 1500 other e-mail addresses. We could
argue about who's the spammer in this situation, but that discussion
will have to wait.
That kind of control over your account is no longer possible: even if
you never share your account with anyone, you will at some point get
spam. It's just the way things are, the "background radiation" of the
internet. Luckily for us, things got so bad that a lot of smart people
sat down to think really hard about this, and came up with Bayesian
filtering, a technique so effective that most of us don't even bother
checking our Spam folders anymore.
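As a refresher, the core of Bayesian filtering fits in a few lines. This toy sketch uses a made-up two-message corpus and ignores everything a real filter has to worry about:

```python
# Naive Bayes spam scoring from word counts in labelled messages,
# with Laplace smoothing and equal priors.
import math
from collections import Counter

spam = ["buy cheap pills now", "cheap pills cheap prices"]
ham  = ["lunch at noon tomorrow", "notes from the meeting"]

spam_words = Counter(w for m in spam for w in m.split())
ham_words  = Counter(w for m in ham  for w in m.split())
spam_total, ham_total = sum(spam_words.values()), sum(ham_words.values())
vocab = set(spam_words) | set(ham_words)

def spam_log_odds(message: str) -> float:
    # log P(spam | message) - log P(ham | message)
    score = 0.0
    for w in message.split():
        p_w_spam = (spam_words[w] + 1) / (spam_total + len(vocab))
        p_w_ham  = (ham_words[w]  + 1) / (ham_total  + len(vocab))
        score += math.log(p_w_spam) - math.log(p_w_ham)
    return score

print(spam_log_odds("cheap pills"))        # positive: looks like spam
print(spam_log_odds("meeting tomorrow"))   # negative: looks like ham
```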
So we succeeded once. It's a good thing to remember, because we have
a much harder battle to fight now: trolling, and its ugly cousin, online harassment.
Let's say you post a message on an online board. These are some of the
things that could happen, in no particular order:
- You could get an interesting, well-thought-out reply (note that "well
thought out" doesn't mean "agrees with you"). It happens.
- You could be modded down by people that disagree with what you just
posted, even if the rules say they shouldn't.
- You could be flooded by negative messages, because a certain group
decided to impose their point of view. This is called brigading, by the
way, and it's usually not personal - they oppose your point of view, not you in particular.
- You could be flooded by negative messages, because a group has
decided to target you online for something you said, or did, or simply
for who you are. This one is personal.
- You could be posting on behalf of a company, in order to speak in
favor of its products while posing as anyone-but-an-employee. This is called
being a shill, and most
websites either pretend that it doesn't happen or they don't care.
- You could be trying to derail a discussion, in order to make sure
a certain point is not brought to light, or is drowned in the noise.
This usually implies that you work for a government agency; it's being
done right now, and it works.
We used to believe that everyone on the internet would eventually
behave nicely, and that we could build our services based on trusting the
95% of users that have no hidden agenda. This is sadly not so, because
- ... people have not behaved nicely on the Internet since its earliest days.
- ... 5% of very loud users are a lot more noticeable than 95% of the
quiet ones. A post-mortem of a DARPA Challenge showed that a single person
can sabotage the work of thousands of well-meaning volunteers.
In the follow-up articles I'm going to comment on what I perceive to
be the three main points at which this issue could be attacked. They are:
- Anonymity: there's no way of taking measures against a person,
only against a user. This is by design, and I'm not arguing
that we should get rid of anonymity. We should instead focus on
identifying toxic users, which I think can be done by implementing the
right user hierarchies and reputation systems.
- Flamewars: derailing discussions in order to kill them. This
may be a job for pattern matching, identifying when the shape of a
discussion is tending towards known anti-patterns. We might also
want to add clustering, in order to identify brigades.
- Harassment: perhaps the hardest one; it requires sentiment analysis
techniques to identify negative comments and kill them before they
reach their destination (a toy sketch of the idea follows this list).
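To give a flavor of that last point, here is a deliberately naive sketch: score a comment against a small word list and hold the worst ones for human review before delivery. Real sentiment analysis is far more involved; the word list and threshold are placeholders:

```python
# Hold comments whose "negativity" score crosses a threshold.
NEGATIVE_WORDS = {"idiot", "stupid", "hate", "worthless", "ugly"}
HOLD_THRESHOLD = 2

def should_hold(comment: str) -> bool:
    words = comment.lower().split()
    negativity = sum(1 for w in words if w.strip(".,!?") in NEGATIVE_WORDS)
    return negativity >= HOLD_THRESHOLD

print(should_hold("You are a stupid, worthless idiot!"))  # True: held for review
print(should_hold("I disagree with your point."))         # False: delivered
```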
In the follow-up essays I'll present some papers about how one would
go about attacking each point. I have no reason to believe that these techniques
are unknown (some of them are already implemented), but I post
them hoping that, much like Bayesian filtering, someone will read them
and have an "oh, wait" moment.
Coming up next: anonymous users and user groups.