Wednesday, January 16, 2019

Internal Google docs show they censor conservative speech

Breitbart has gotten hold of internal Google communications that show the company manually censoring views it holds to be wrong.

For example, Google has manually changed the results you get when you search for "abortion" on YouTube in order to downgrade pro-life videos. Another example: immediately after one left-wing journalist complained about the results that came up when someone searched for "Maxine Waters", the results "improved".

A Google Trust and Safety team member, Daniel Aaronson, wrote:

"These lines[what to censor] are very difficult and can be very blurry, we are all well aware of this. So we’ve got huge teams that stay cognizant of these facts when we’re crafting policies considering classifier changes, or reacting with manual actions – these decisions are not made in a vacuum, but admittedly are also not made in a highly public forum like TGIF or IndustryInfo (as you can imagine, decisions/agreement would be hard to get in such a wide list – image if all your CL’s were reviewed by every engineer across Google all the time)."

This is an admission that if these decisions were submitted to a large group there would be no consensus on what to censor. Given that the "huge" teams are drawn from an ideologically monolithic group--remember that Google fired an engineer for saying that there may be fewer female software engineers not because women are less intelligent or competent but because fewer women are interested in the job--Aaronson is telling us that Google censors based on a single ideology.

While he admits that the "lines"--what to censor--are blurry, he seems unconcerned that Google is effectively acting as though they're not.

Further, he seems completely oblivious to the fact that crowdsourcing truth doesn't work. He writes:

 "So imagine a classifier that says, for any queries on a particular text file, let’s pull videos using signals that we historically understand to be strong indicators of quality (I won’t go into specifics here, but those signals do exist)."

Ask yourself what sort of information these signals can be based on. AI is not yet at the point where it can understand video content, and even if it were, an AI's definition of good or bad, true or false, would depend entirely on how it was programmed. If a Nazi programs an AI, it will find that saying Jews are evil is acceptable; if a Jew programs it, it will find that saying Nazis are evil is acceptable. We can agree that in this case the Jew got it right, but the point is that AI is like all other software: garbage in results in garbage out.

So the only other information available is how people react to a video. In practice that means Google can look at who likes or dislikes a video, what the comments about it say, and what sites link to it.
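To make that concrete, here is a minimal sketch in Python of what a reaction-based "signal" might look like. The function name, the weights, and the inputs are my own illustrative assumptions, not anything from the leaked documents:

    # Hypothetical sketch of a reaction-based quality signal.
    # None of these weights or names come from Google; they only
    # illustrate the kind of inputs such a classifier could combine.
    def quality_signal(likes, dislikes, comments, linking_sites, reliable_sites):
        # Fraction of reactions that are positive.
        like_ratio = likes / max(likes + dislikes, 1)
        # Fraction of comments that call the video a lie.
        lie_mentions = sum(1 for c in comments if "lie" in c.lower())
        lie_penalty = lie_mentions / max(len(comments), 1)
        # Inbound links from sites someone has labeled "reliable".
        reliable_links = sum(1 for s in linking_sites if s in reliable_sites)
        return 0.5 * like_ratio - 0.3 * lie_penalty + 0.2 * reliable_links

Notice that every input--who clicked like, who commented, which sites made the "reliable" list--is supplied by people, and that is exactly where bias enters.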

But for signals based on those sources to be valid, the inputs have to be unbiased. For example, if leftists are more likely than conservatives to declare that an undercover video is a lie, even when it's the truth, then any signal based on how often the word "lie" is associated with a video is not an objective measure of quality but a poll of a statistically bad sample--as though we polled 80% Democrats and 20% Republicans about how well Trump is doing.
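A quick calculation shows why such a sample is worthless. The approval rates below are made-up numbers chosen only to illustrate the arithmetic:

    # Made-up numbers: suppose 10% of Democrats and 90% of Republicans
    # approve of Trump, and the true population is split 50/50.
    dem_approval, rep_approval = 0.10, 0.90

    true_rate = 0.5 * dem_approval + 0.5 * rep_approval    # 0.50
    skewed_rate = 0.8 * dem_approval + 0.2 * rep_approval  # 0.26

    print(f"50/50 sample: {true_rate:.0%} approval")   # 50%
    print(f"80/20 sample: {skewed_rate:.0%} approval") # 26%

The 80/20 poll reports 26% approval when the real figure is 50%; the measurement reflects the composition of the sample, not the thing being measured. A "lie" signal drawn from an ideologically skewed commenter base fails in exactly the same way.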

Similarly, looking at what sites link to a video is highly subjective. For example, the Southern Poverty Law Center used to be a civil rights organization, but it is now an extremist left-wing hate group that says anyone who opposes the redefinition of marriage is an extremist hater. Yet Google views the SPLC as a reliable source. Hence if the SPLC links to a factually inaccurate video, and Google uses that link as one of its "signals", Google will declare the video to be true. Similarly, leftists at Google probably rate pro-life sites as unreliable, so that a video those sites link to will be classified as false.

The core problem is that any "signal" rests on Google's definition of what is good and what is bad; e.g., the NYT is a reliable source but Breitbart is not. Change those assumptions and the algorithm will generate completely different results.
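Here is a hypothetical sketch of that dependency: score the same video against two different "reliable source" lists and the verdict flips. The site names, lists, and threshold are illustrative assumptions, not Google's actual criteria:

    # The same inbound links scored against two different notions
    # of "reliable". The lists and the threshold are hypothetical.
    linking_sites = {"nytimes.com", "breitbart.com", "splcenter.org"}

    reliable_a = {"nytimes.com", "splcenter.org"}  # one worldview
    reliable_b = {"breitbart.com"}                 # the opposite worldview

    def verdict(links, reliable):
        score = sum(1 for s in links if s in reliable)
        return "promote" if score >= 2 else "downrank"

    print(verdict(linking_sites, reliable_a))  # promote
    print(verdict(linking_sites, reliable_b))  # downrank

Nothing about the video changed between the two runs; only the assumptions did.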

This is why censorship of anything other than illegal content--child porn, calls to kill Blacks, etc.--will never work: even if we assume that the people at Google are sincerely trying to cull out lies, their ideological blinders will ensure that it is the voices they disagree with that get censored.

