GIGO as applied to AI


The concept of “GIGO”–Garbage In, Garbage Out–has been around almost as long as computer programming itself.

GIGO is the idea that, no matter how well written a computer program or algorithm is, if you feed it bad data the resulting output will be “bad”: at best meaningless, at worst actively misleading.

Nothing surprising here–as programmers we are well aware of this problem and often take great pains to protect an algorithm implementation against “Garbage In”.

It’s not possible to protect against all such cases, of course, human nature being what it is.
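Still, the basic guard is worth showing. Here is a minimal sketch of that kind of defensive check; the function and its validation rules are invented purely for illustration:

```python
def average_score(scores: list[float]) -> float:
    """Compute a simple mean, refusing obviously bad input."""
    if not scores:
        raise ValueError("empty input: nothing to average")
    for s in scores:
        # Reject anything that cannot be a legitimate 0-100 score:
        # non-numbers, NaN, or out-of-range values.
        if not isinstance(s, (int, float)) or s != s or not (0 <= s <= 100):
            raise ValueError(f"garbage value rejected: {s!r}")
    return sum(scores) / len(scores)

# The algorithm itself (a mean) is trivially correct, yet without these guards
# a NaN or an impossible value like 900 would flow straight through the
# calculation and quietly corrupt the output: garbage in, garbage out.
print(average_score([88, 92.5, 79]))  # 86.5
```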

Which brings us to the story behind this blog post: the improper use of Generative AI to “make decisions” whose consequences can be seriously damaging.

The starting point for this story: the state of Iowa in the United States is one of several states that have recently passed laws aimed at protecting young students from exposure to “inappropriate” materials in the school setting.

Senate File 496 includes limitations on school and classroom library collections, requiring that every book available to students be “age appropriate” and free of any “descriptions or visual depictions of a sex act” according to Iowa Code 702.17.

The Gazette

The Gazette (a daily newspaper in Cedar Rapids, Iowa) has the story of a school district in its area that has chosen to use AI (Machine Learning) to determine which books may run afoul of this new law.

Their reasoning for using AI? Assistant Superintendent of Curriculum and Instruction Bridgette Exman told The Gazette that it was “simply not feasible to read every book…”

Sounds reasonable, right?

Well, the school district chose to generate the list of proscribed books by “feeding [the AI] a list of proscribed books [provided from other sources]” and seeing whether the resulting output list presented “any surprises” to a staff librarian.

See the problem here? As noted in a blog about the news story:

The district didn’t run every book through the process, only the “commonly challenged” ones; if the end result was a list of commonly challenged books and no books that aren’t commonly challenged, well, there you go.

Daily Kos

It appears that people who don’t understand how to use Machine Learning misused it–GIGO?–and now have a trained AI that they think will allow them to filter out inappropriate books without having a human read and judge them.
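The gap is easy to state in code: a check that only ever sees commonly challenged titles measures, at best, how the model treats that small set; it says nothing about how many unobjectionable books the model would wrongly flag across the rest of the library. Here is a minimal sketch of the missing evaluation (the title lists and the flag_book stand-in are invented; in the district’s process the flagging step was a ChatGPT query, and the labels would still require a human to read the books):

```python
# Invented stand-in for "ask the LLM whether this title violates the law".
# Here it is a hard-coded set so the sketch runs on its own.
FLAGGED_BY_MODEL = {"Book A", "Book B", "Book D"}

def flag_book(title: str) -> bool:
    return title in FLAGGED_BY_MODEL

# Labels a human reviewer would have to establish by actually reading the books.
challenged_titles   = ["Book A", "Book B", "Book C"]  # the only titles actually checked
unchallenged_titles = ["Book D", "Book E", "Book F"]  # the rest of the shelves, never tested

# Evaluating only the challenged list fills in one row of the confusion matrix...
true_pos  = sum(flag_book(t) for t in challenged_titles)
false_neg = len(challenged_titles) - true_pos

# ...while the false positives, i.e. harmless books the model would pull from
# the shelves, are only visible if the untested remainder of the library is
# checked as well.
false_pos = sum(flag_book(t) for t in unchallenged_titles)

print(f"caught {true_pos}/{len(challenged_titles)} challenged titles, "
      f"missed {false_neg}, and would wrongly flag {false_pos} of the untested books")
```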

Regardless of whether or not any of the titles do or do not contain said content, ChatGPT’s varying responses highlight troubling deficiencies of accuracy, analysis, and consistency. A repeat inquiry regarding The Kite Runner, for example, gives contradictory answers. In one response, ChatGPT deems Khaled Hosseini’s novel to contain “little to no explicit sexual content.” Upon a separate follow-up, the [Large Language Model] affirms the book “does contain a description of a sexual assault.”

Popular Science
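The inconsistency is not surprising: a chat model samples each answer, so asking the identical question twice can yield contradictory verdicts. A minimal sketch, assuming the OpenAI Python client is installed and an API key is configured (the model name and prompt wording are illustrative, and the replies will differ from run to run):

```python
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

PROMPT = (
    "Does 'The Kite Runner' by Khaled Hosseini contain a description of a sex act "
    "as defined by Iowa Code 702.17? Answer yes or no, with one sentence of explanation."
)

# Ask the identical question several times. With a nonzero temperature the model
# samples a different completion on each call, so the verdicts can contradict
# one another, which is exactly the behavior reported for the book screening.
for attempt in range(3):
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": PROMPT}],
        temperature=1.0,
    )
    print(f"attempt {attempt + 1}: {response.choices[0].message.content}")
```

Lowering the temperature reduces the run-to-run variation, but it does not make the underlying judgment any more reliable.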

This misuse of AI/ML is not uncommon–we’ve seen cases where law enforcement has trained facial recognition programs in a way which creates serious racial bias, for instance.

We, as IT professionals, need to be aware of and on the lookout for such misuses, as we are in the best position to spot these situations and understand how to avoid them.