I'll cut right to the chase: the word "data" is plural. It's the plural form of Latin word "datum." Many data. One datum.

Datum means "a given" or "something that should be taken into consideration."

Nobody really uses the word "datum." Instead people say "data point" to represent a single unit of data.

So yes – when developers and data scientists say: "the data suggest these two phenomena are correlated" instead of "suggests" – they are indeed using the correct subject-verb agreement for the word "data".

This is not a matter of being cheeky. There is a single objective reality we all live in. And in that reality, the word "data" is plural.

But it's OK for people to say "the data is" instead of "the data are"

It really is.

Life is too short for this kind of nitpicking.

I often hear people who work with data use the wrong subject-verb agreement for the word data. I don't correct them. Because I don't want them to dismiss me as pedantic. Or to take this as some form of deeply-felt criticism.

And I recommend you don't correct people on this, either. It's just not that big a deal.

But remember, there 3 things that can instantly make anyone sound smarter:

  • Black frame glasses
  • A white lab coat
  • Saying "data are" instead of "data is"

Happy coding.