Loading data into Apache Ignite

The rest of this piece shows a number of ways to load data into Ignite. One caveat I want to share before we get started: you probably don’t want to use any of these solutions in a production environment. These are mostly for experimentation. Maybe you want to play with another feature and need to quickly load in some reference data (how about taking a look at the machine learning APIs?). Or perhaps you want to load in some data so you can ping it with one of the third-party integrations.

Probably the simplest method is to use Ignite’s SQL interface, which can bulk-load a CSV file with its COPY command.
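As a rough sketch of what that can look like, the script below writes a two-statement SQL file and, if it can find Ignite's bundled sqlline client, runs it. `COPY ... FORMAT CSV` is Ignite's bulk-load statement; the `person` table, its columns, the file paths and the `/opt/ignite` install location are all made up for illustration, and the `--run` flag is assumed to be supported by your sqlline version.

```shell
#!/bin/sh
# Hypothetical names and paths: change these to match your data.
CSV_FILE="${CSV_FILE:-/tmp/people.csv}"
SQL_SCRIPT="${TMPDIR:-/tmp}/load_people.sql"
SQLLINE="${IGNITE_HOME:-/opt/ignite}/bin/sqlline.sh"

# COPY needs an existing table, so create one first (illustrative schema).
cat > "$SQL_SCRIPT" <<EOF
CREATE TABLE IF NOT EXISTS person (id INT PRIMARY KEY, name VARCHAR, city VARCHAR);
COPY FROM '$CSV_FILE' INTO person (id, name, city) FORMAT CSV;
EOF

# Run the script through the JDBC thin driver if sqlline is installed.
# Assumption: a node is listening on localhost and sqlline accepts --run.
if [ -x "$SQLLINE" ]; then
  "$SQLLINE" -u jdbc:ignite:thin://127.0.0.1/ --run="$SQL_SCRIPT"
else
  echo "sqlline not found; SQL script written to $SQL_SCRIPT"
fi
```

The column list in the COPY statement has to match the order of the fields in the CSV file.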

Sadly, that command doesn’t have any options or levers to tweak if your data isn’t in exactly the right format.

As any savvy Unix user will tell you, you can do wonders with a few commands in the shell, if you know the magic incantations.

You can add or remove quotes using sed or your favourite text editor. You can easily remove columns using cut. Reordering them takes something like awk, because cut always emits fields in their original order.
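Here is a small sketch of that kind of clean-up. The sample file and its columns are invented for illustration. Note that cut keeps fields in their input order no matter how you list them, so the reordering step uses awk instead.

```shell
#!/bin/sh
WORK_DIR="$(mktemp -d)"

# Made-up input: every field is quoted and there is a fourth column we don't want.
cat > "$WORK_DIR/raw.csv" <<'EOF'
"1","alice","London","ignore me"
"2","bob","Paris","ignore me too"
EOF

# sed strips every double quote; cut keeps only the first three fields.
# This is deliberately naive: it breaks if a field contains a comma or a quote.
sed 's/"//g' "$WORK_DIR/raw.csv" | cut -d, -f1-3 > "$WORK_DIR/trimmed.csv"

# awk swaps the second and third columns, writing commas between fields.
awk -F, 'BEGIN { OFS = "," } { print $1, $3, $2 }' \
  "$WORK_DIR/trimmed.csv" > "$WORK_DIR/clean.csv"

cat "$WORK_DIR/clean.csv"
```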

But what if your data file is not anything like CSV?

If your file is in JSON format, you can still convert it into CSV from the command line.
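One way to do it, sketched here on the assumption that jq is installed and that the file holds one flat JSON object per record. The field names `id`, `name` and `time` are hypothetical; substitute your own.

```shell
#!/bin/sh
# Convert a file of flat JSON objects to CSV, one row per object.
# Assumptions: jq is on the PATH; each object has id, name and time fields.
json_to_csv() {
  # Build an array of the wanted fields, then let jq's @csv filter
  # handle the quoting and escaping. -r prints raw text, not JSON strings.
  jq -r '[.id, .name, .time] | @csv' "$1"
}
```

You would call it as `json_to_csv events.json > events.csv`. The @csv filter quotes string values and leaves numbers bare.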

Once a command line gets to the point where it looks like line noise, I usually decide that I want a simpler solution. A higher-level tool like Spark is one option.

Out of the box, Spark doesn’t know about Ignite, so we have to start it with a couple of extra parameters.
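A minimal sketch of such a launch, assuming the key parameter is `--packages` with the Maven coordinates of Ignite's Spark module. The version number is an assumption and should match your cluster; depending on your setup you may need further flags, such as `--master` or `--repositories`.

```shell
#!/bin/sh
# Assumption: Ignite 2.7.0. Use the version your cluster runs.
IGNITE_VERSION="${IGNITE_VERSION:-2.7.0}"

# Maven coordinates of Ignite's Spark integration module.
IGNITE_SPARK_PACKAGE="org.apache.ignite:ignite-spark:${IGNITE_VERSION}"

start_spark_shell() {
  # --packages makes Spark download the artifact and its dependencies
  # and add them to the classpath. Extra arguments are passed through.
  spark-shell --packages "$IGNITE_SPARK_PACKAGE" "$@"
}
```

Calling `start_spark_shell` then opens an interactive session with the Ignite classes available.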

Next, load the data into a Spark DataFrame. The nice thing about this is that Spark reads the JSON and infers its structure for you.

You can define the schema yourself if Spark has got anything wrong (in this case, it has taken the time column to be a string rather than a timestamp). Depending on your level of experience, you may prefer to fix these errors in Spark or later on in Ignite; whatever works.

You can do any manipulation you like at this point (adding or removing columns, computing new values, changing data types, filtering records) and then write it out to Ignite.
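The whole read, fix and write sequence might look roughly like the sketch below, which generates a short Scala script and feeds it to spark-shell. The option constants come from Ignite's `IgniteDataFrameSettings`; the `events` table, the `id` key field, both file paths and the Ignite version are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical paths: point these at your JSON file and Ignite Spring config.
JSON_FILE="${JSON_FILE:-/tmp/events.json}"
IGNITE_CONFIG="${IGNITE_CONFIG:-/opt/ignite/config/default-config.xml}"
SCRIPT="${TMPDIR:-/tmp}/json_to_ignite.scala"

# Method chains end each line with a dot so the REPL keeps reading.
cat > "$SCRIPT" <<EOF
import org.apache.ignite.spark.IgniteDataFrameSettings._
import org.apache.spark.sql.functions.col

// Read the JSON file; Spark infers the schema from the records.
val raw = spark.read.json("$JSON_FILE")
raw.printSchema()

// Fix the column Spark guessed wrong: treat "time" as a timestamp.
val events = raw.withColumn("time", col("time").cast("timestamp"))

// Write to a new Ignite table keyed on the (assumed) "id" field.
events.write.
  format(FORMAT_IGNITE).
  option(OPTION_CONFIG_FILE, "$IGNITE_CONFIG").
  option(OPTION_TABLE, "events").
  option(OPTION_CREATE_TABLE_PRIMARY_KEY_FIELDS, "id").
  save()
EOF

# Run the script if Spark is installed; the shell exits at end of input.
if command -v spark-shell >/dev/null 2>&1; then
  spark-shell --packages "org.apache.ignite:ignite-spark:${IGNITE_VERSION:-2.7.0}" < "$SCRIPT"
else
  echo "spark-shell not found; Scala script written to $SCRIPT"
fi
```

With the default save mode the write fails if the table already exists, which is a reasonable safety net while experimenting.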

And that’s it. No manipulation using sed or awk, just read in JSON, write to Ignite, complete with a schema. Nice.

In conclusion, the easiest way to get data into Ignite without coding is to have the data already in CSV format. Failing that, with a combination of Unix shell tools and higher-level tools like Spark, you can go a long way without writing a single line of Java.
