<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/"
    xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
    <channel>
        
        <title>
            <![CDATA[ Neo4j - freeCodeCamp.org ]]>
        </title>
        <description>
            <![CDATA[ Browse thousands of programming tutorials written by experts. Learn Web Development, Data Science, DevOps, Security, and get developer career advice. ]]>
        </description>
        <link>https://www.freecodecamp.org/news/</link>
        <image>
            <url>https://cdn.freecodecamp.org/universal/favicons/favicon.png</url>
            <title>
                <![CDATA[ Neo4j - freeCodeCamp.org ]]>
            </title>
            <link>https://www.freecodecamp.org/news/</link>
        </image>
        <generator>Eleventy</generator>
        <lastBuildDate>Fri, 08 May 2026 22:32:38 +0000</lastBuildDate>
        <atom:link href="https://www.freecodecamp.org/news/tag/neo4j/rss.xml" rel="self" type="application/rss+xml" />
        <ttl>60</ttl>
        
            <item>
                <title>
                    <![CDATA[ Learn to Build Graph Databases with Neo4j (Full Course) ]]>
                </title>
                <description>
                    <![CDATA[ Neo4j is revolutionizing the way we handle complex relationships between data points. Its intuitive graph-based structure provides a flexible and efficient solution for various applications. We just published a Neo4j course on the freeCodeCamp.org Yo... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/learn-neo4j-database-course/</link>
                <guid isPermaLink="false">66b204c9a8b92c9329236499</guid>
                
                    <category>
                        <![CDATA[ Neo4j ]]>
                    </category>
                
                    <category>
                        <![CDATA[ youtube ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ Beau Carnes ]]>
                </dc:creator>
                <pubDate>Thu, 01 Jun 2023 16:49:15 +0000</pubDate>
                <media:content url="https://www.freecodecamp.org/news/content/images/2023/06/neo4j.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>Neo4j is revolutionizing the way we handle complex relationships between data points. Its intuitive graph-based structure provides a flexible and efficient solution for various applications.</p>
<p>We just published a Neo4j course on the freeCodeCamp.org YouTube channel. Whether you're a developer, a data scientist, or an aspiring technology enthusiast, this course is designed to equip you with the knowledge and skills needed to harness the full potential of Neo4j.</p>
<p>The course is taught by freeCodeCamp team members Farhan Chowdhury and Gavin Lon. They will teach you the basics of Neo4j and how to integrate it into real-world applications.</p>
<p>The course begins with a comprehensive introduction to Neo4j and graph database management systems. You'll learn how incorporating Neo4j into your applications can bring numerous benefits, such as improved performance, simplified querying, and enhanced data modeling capabilities. By understanding the fundamentals, you'll be well-prepared to dive deeper into the practical aspects of using Neo4j.</p>
<p>One of the highlights of this course is the hands-on project that guides you through building a real-world application using Java and Spring Boot. You'll discover how to leverage Neo4j as the backend storage for your application, enabling you to effectively model and manage relationships between data entities. From creating the initial database and connecting to it, to implementing courses, lessons, users, and authentication, you'll gain invaluable experience in building a robust application powered by Neo4j.</p>
<p>But that's not all! The course takes a holistic approach to application development by also covering the frontend implementation. You'll learn how to create a dynamic user interface using React to interact with the data stored in Neo4j. By combining the power of Neo4j's graph database with a modern frontend framework like React, you'll have the tools to create cutting-edge applications that excel in performance and usability.</p>
<p>Neo4j provided a grant to make this course possible. Their support has enabled us to bring you this comprehensive and immersive learning experience, empowering you to leverage the full potential of graph databases.</p>
<p>To fully benefit from this course, it is recommended that you have some basic knowledge of databases and programming. Familiarity with Java, Spring Boot, React, and JavaScript will also be advantageous.</p>
<p>So if you are ready to start learning about this powerful graph database system, watch the full course on the <a target="_blank" href="https://www.youtube.com/watch?v=_IgbB24scLI">freeCodeCamp.org YouTube channel</a> (5-hour watch).</p>
<div class="embed-wrapper">
        <iframe width="560" height="315" src="https://www.youtube.com/embed/_IgbB24scLI" style="aspect-ratio: 16 / 9; width: 100%; height: auto;" title="YouTube video player" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen="" loading="lazy"></iframe></div>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to produce and consume data streams directly via Cypher with Streams Procedures ]]>
                </title>
                <description>
                    <![CDATA[ By Andrea Santurbano Leveraging Neo4j Streams — Part 3 This article is the third part of the Leveraging Neo4j Streams series (Part 1 is here, Part 2 is here). In it, I’ll show you how to bring Neo4j into your Apache Kafka flow by using the streams pr... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-produce-and-consume-data-streams-directly-via-cypher-with-streams-procedures-52cbc5f543f1/</link>
                <guid isPermaLink="false">66c353fd5f85c1948b3fabaf</guid>
                
                    <category>
                        <![CDATA[ Apache Kafka ]]>
                    </category>
                
                    <category>
                        <![CDATA[ data ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Neo4j ]]>
                    </category>
                
                    <category>
                        <![CDATA[ streaming ]]>
                    </category>
                
                    <category>
                        <![CDATA[ tech  ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Thu, 09 May 2019 17:13:07 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*47Ktwi-Gdj5S7keZpiteZA.gif" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Andrea Santurbano</p>
<h4 id="heading-leveraging-neo4j-streams-part-3">Leveraging Neo4j Streams — Part 3</h4>
<p>This article is the third part of the <strong>Leveraging Neo4j Streams</strong> series (Part 1 is <a target="_blank" href="https://medium.freecodecamp.org/how-to-leverage-neo4j-streams-and-build-a-just-in-time-data-warehouse-64adf290f093">here</a>, Part 2 is <a target="_blank" href="https://medium.freecodecamp.org/how-to-ingest-data-into-neo4j-from-a-kafka-stream-a34f574f5655">here</a>). In it, I’ll show you how to bring Neo4j into your <strong>Apache Kafka</strong> flow by using the streams procedures available with <a target="_blank" href="https://medium.com/neo4j/a-new-neo4j-integration-with-apache-kafka-6099c14851d2"><strong>Neo4j Streams</strong></a>.</p>
<p>To show how to integrate them, simplify the integration, and let you test the whole project by hand, I’ll use <a target="_blank" href="https://towardsdatascience.com/building-a-graph-data-pipeline-with-zeppelin-spark-and-neo4j-8b6b83f4fb70"><strong>Apache Zeppelin</strong></a>, a notebook runner that lets you <a target="_blank" href="https://towardsdatascience.com/building-a-graph-data-pipeline-with-zeppelin-spark-and-neo4j-8b6b83f4fb70">natively interact with Neo4j</a>.</p>
<h3 id="heading-what-is-a-neo4j-stored-procedure">What is a Neo4j Stored Procedure?</h3>
<p>Starting from Neo4j 3.x, the concept of <a target="_blank" href="https://neo4j.com/docs/java-reference/current/extending-neo4j/procedures/"><strong>user-defined procedures and functions</strong></a> was introduced. These are custom implementations of certain functionalities and/or business rules that can’t be (easily) expressed in Cypher itself.</p>
<p>Neo4j provides a number of built-in procedures. The <a target="_blank" href="http://neo4j-contrib.github.io/neo4j-apoc-procedures/">APOC</a> library adds another 450, covering everything from data integration to graph refactorings.</p>
<h3 id="heading-what-are-the-streams-procedures">What are the streams procedures?</h3>
<p>The Neo4j Streams project ships with two procedures:</p>
<ul>
<li><code>streams.publish</code>: allows custom message streaming from Neo4j to the configured environment by using the underlying configured Producer</li>
<li><code>streams.consume</code>: allows consuming messages from a given topic.</li>
</ul>
<h3 id="heading-set-up-the-environment">Set-Up the Environment</h3>
<p>In the following <a target="_blank" href="https://github.com/conker84/leveraging-neo4j-streams">GitHub repo</a> you’ll find everything you need to replicate what I’m presenting in this article. All you need to get started is <a target="_blank" href="https://docs.docker.com/"><strong>Docker</strong></a>; then you can spin up the stack by entering the directory and running the following command from the terminal:</p>
<pre><code>$ docker-compose up
</code></pre><p>This will start up the whole environment, which comprises:</p>
<ul>
<li>Neo4j + Neo4j Streams module + APOC procedures</li>
<li>Apache Kafka</li>
<li>Apache Spark (which is not necessary in this article, but it’s used in the previous two)</li>
<li>Apache Zeppelin</li>
</ul>
<p>Open Apache Zeppelin at <code>http://localhost:8080</code> and, in the <code>Medium/Part 3</code> directory, you’ll find a notebook called “<strong>Streams Procedures</strong>”, which is the subject of this article.</p>
<h3 id="heading-streamspublish">streams.publish</h3>
<p>This procedure allows custom message streaming from Neo4j to the configured environment by using the underlying configured Producer.</p>
<p>It takes two variables as input and returns nothing (as it sends its payload asynchronously to the stream):</p>
<ul>
<li><strong>topic</strong>, <em>type String</em>: where the data will be published</li>
<li><strong>payload</strong>, <em>type Object</em>: what you want to stream.</li>
</ul>
<p>Example:</p>
<pre><code>CALL streams.publish(<span class="hljs-string">'my-topic'</span>, <span class="hljs-string">'Hello World from Neo4j!'</span>)
</code></pre><p>The message retrieved from the Consumer is the following:</p>
<pre><code>{<span class="hljs-string">"payload"</span>: <span class="hljs-string">"Hello world from Neo4j!"</span>}
</code></pre><p>You can send any kind of data in the payload: <strong>nodes, relationships, paths, lists, maps, scalar values and nested versions thereof</strong>.</p>
<p>In the case of nodes and/or relationships, if the topic is covered by the patterns defined in the Change Data Capture (CDC) configuration, their properties will be filtered according to that configuration.</p>
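<p>For example (a sketch of my own, assuming a <code>Person</code> node with that name already exists in your graph), publishing a node instead of a plain string looks like this:</p>
<pre><code>MATCH (p:Person {name: 'Andrea'})
CALL streams.publish('my-topic', p)
</code></pre>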
<p>Following is a simple video that shows the procedure in action:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/jmaPyKRDXsCEdwZEdeMzNyE2BoKXolGMEbjR" alt="Image" width="600" height="400" loading="lazy">
<em>The streams.publish procedure in action</em></p>
<h3 id="heading-streamsconsume">streams.consume</h3>
<p>This procedure allows for consuming messages from a given topic.</p>
<p>It takes two variables as input:</p>
<ul>
<li><strong>topic</strong>, <em>type String</em>: the topic you want to consume data from</li>
<li><strong>config</strong>, <em>type Map</em>: the configuration parameters</li>
</ul>
<p>and returns a list of collected events.</p>
<p>The <strong>config</strong> params are:</p>
<ul>
<li><strong>timeout</strong>, <em>type Long</em>: the value (in milliseconds) passed to the Kafka <a target="_blank" href="https://kafka.apache.org/10/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#poll-long-"><code>Consumer#poll</code></a> method. Default 1000.</li>
<li><strong>from</strong>, <em>type String</em>: it’s the Kafka configuration parameter <code>auto.offset.reset</code></li>
</ul>
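<p>Putting both parameters together (the values here are illustrative, not recommended defaults), a call with an explicit configuration looks like this:</p>
<pre><code>CALL streams.consume('my-topic', {timeout: 5000, from: 'earliest'})
YIELD event
RETURN event
</code></pre>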
<p>Use:</p>
<pre><code>CALL streams.consume(<span class="hljs-string">'my-topic'</span>, {<span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">config</span>&gt;</span>}) YIELD event RETURN event</span>
</code></pre><p>Example: Imagine you have a producer that publishes events like this:</p>
<pre><code>{<span class="hljs-string">"name"</span>: <span class="hljs-string">"Andrea"</span>, <span class="hljs-string">"surname"</span>: <span class="hljs-string">"Santurbano"</span>}
</code></pre><p>We can create user nodes in this way:</p>
<pre><code>CALL streams.consume(<span class="hljs-string">'my-topic'</span>, {<span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">config</span>&gt;</span>}) YIELD event
CREATE (p:Person {firstName: event.data.name, lastName: event.data.surname})</span>
</code></pre><p>Following is a simple video that shows the procedure in action:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/m0Lui2cBqiT0OQO9DTuiQfOYHZmpAKwRB4Ys" alt="Image" width="600" height="400" loading="lazy">
<em>The stream.consume procedure in action</em></p>
<p>So this is the end of the “Leveraging Neo4j Streams” series. I hope you enjoyed it!</p>
<p>If you have already tested the Neo4j-Streams module or tested it via this notebook, please fill out our <a target="_blank" href="https://goo.gl/forms/VLwvqwsIvdfdm9fL2"><strong>feedback survey</strong></a>.</p>
<p>If you run into any issues or have thoughts about improving our work, <a target="_blank" href="http://github.com/neo4j-contrib/neo4j-streams/issues">please raise a GitHub issue</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to detect a user’s preferred color scheme in JavaScript ]]>
                </title>
                <description>
                    <![CDATA[ By Oskar Hane In recent versions of macOS (Mojave) and Windows 10, users have been able to enable a system level dark mode. This works well and is easy to detect for native applications. Websites have been the odd apps where it’s up to the website pu... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-detect-a-users-preferred-color-scheme-in-javascript-ec8ee514f1ef/</link>
                <guid isPermaLink="false">66c351ab0107ba195e79f70c</guid>
                
                    <category>
                        <![CDATA[ CSS ]]>
                    </category>
                
                    <category>
                        <![CDATA[ JavaScript ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Neo4j ]]>
                    </category>
                
                    <category>
                        <![CDATA[ General Programming ]]>
                    </category>
                
                    <category>
                        <![CDATA[ UX ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Mon, 18 Mar 2019 16:54:59 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*QPIhIZte1bW0DKQoLoXwxw.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Oskar Hane</p>
<p>In recent versions of macOS (Mojave) and Windows 10, users have been able to enable a system level dark mode. This works well and is easy to detect for native applications.</p>
<p>Websites have been the odd ones out: it’s up to the website publisher to decide which color scheme users get. Some websites do offer theme support, but to switch, users have to find the setting and update it manually for each individual website.</p>
<p>Would it be possible to have this detection done automatically and have websites present a theme that respects the user’s preference?</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*QPIhIZte1bW0DKQoLoXwxw.png" alt="Image" width="600" height="400" loading="lazy">
<em>Light vs Dark theme in Neo4j Browser</em></p>
<h3 id="heading-css-media-query-prefers-color-scheme-draft">CSS media query <code>prefers-color-scheme</code> draft</h3>
<p>There is a CSS media queries draft level 5 where <a target="_blank" href="https://drafts.csswg.org/mediaqueries-5/#descdef-media-prefers-color-scheme">prefers-color-scheme</a> is specified. It is meant to detect if the user has requested the system to use a light or dark color theme.</p>
<p>This sounds like something we can work with! We need to stay up to date with any changes to the draft, though, as it might change at any time since it’s just a… draft. The <code>prefers-color-scheme</code> query can have three different values: <code>light</code>, <code>dark</code>, and <code>no-preference</code>.</p>
<h3 id="heading-web-browser-support-as-of-march-2019">Web browser support as of March 2019</h3>
<p>The current browser support is <em>very</em> limited, and it’s not available in any of the stable releases of any vendor. We can only enjoy this in <a target="_blank" href="https://developer.apple.com/safari/technology-preview/">Safari Technology Preview of version 12.1</a> and in <a target="_blank" href="https://www.mozilla.org/en-US/firefox/67.0a1/releasenotes/">Firefox 67.0a1</a>. What’s great is that there are binaries that do support it, so we can work with it and try it out in web browsers. For current browser support, check out <a target="_blank" href="https://caniuse.com/#search=prefers-color-scheme">https://caniuse.com/#search=prefers-color-scheme</a>.</p>
<h3 id="heading-why-css-only-detection-isnt-enough">Why CSS only detection isn’t enough</h3>
<p>The common approach I’ve seen so far is a CSS-only one: override CSS rules for certain classes when a media query matches.<br>Something like this:</p>
<pre><code class="lang-css"><span class="hljs-comment">/* global.css */</span>

<span class="hljs-selector-class">.themed</span> {
  <span class="hljs-attribute">display</span>: block;
  <span class="hljs-attribute">width</span>: <span class="hljs-number">10em</span>;
  <span class="hljs-attribute">height</span>: <span class="hljs-number">10em</span>;
  <span class="hljs-attribute">background</span>: black;
  <span class="hljs-attribute">color</span>: white;
}

<span class="hljs-keyword">@media</span> (<span class="hljs-attribute">prefers-color-scheme:</span> light) {
  <span class="hljs-selector-class">.themed</span> {
    <span class="hljs-attribute">background</span>: white;
    <span class="hljs-attribute">color</span>: black;
  }
}
</code></pre>
<p>While this works fine for many use cases, some styling techniques don’t use CSS this way. If <a target="_blank" href="https://www.styled-components.com">styled-components</a> is used for theming, for example, a JS object is replaced when the theme changes.</p>
<p>Having access to the preferred scheme is also useful for analytics and for more predictable CSS overrides, as well as for more fine-grained control over which elements should be themed.</p>
<h3 id="heading-initial-js-approach">Initial JS approach</h3>
<p>I’ve learned in the past that you can do media query detection by setting the CSS <code>content</code> of an element to a value if a media query is matched. This is definitely a hack, but it works!</p>
<p>Something like this:</p>
<pre><code class="lang-css"><span class="hljs-comment">/* global.css */</span>

<span class="hljs-selector-tag">html</span> {
  <span class="hljs-attribute">content</span>: <span class="hljs-string">""</span>;
}

<span class="hljs-keyword">@media</span> (<span class="hljs-attribute">prefers-color-scheme:</span> light) {
  <span class="hljs-selector-tag">html</span> {
    <span class="hljs-attribute">content</span>: <span class="hljs-string">"light"</span>;
  }
}

<span class="hljs-keyword">@media</span> (<span class="hljs-attribute">prefers-color-scheme:</span> dark) {
  <span class="hljs-selector-tag">html</span> {
    <span class="hljs-attribute">content</span>: <span class="hljs-string">"dark"</span>;
  }
}
</code></pre>
<p>So when a user loads the CSS and the media query matches one of the above color schemes, the <code>content</code> property value of the <code>html</code> element is set to either ‘light’ or ‘dark’.</p>
<p>The question then is, how do we read the <code>content</code> value of the <code>html</code> element?</p>
<p>We can use <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/Window/getComputedStyle">window.getComputedStyle</a>, like this:</p>
<pre><code class="lang-js"><span class="hljs-keyword">const</span> value = <span class="hljs-built_in">window</span>
  .getComputedStyle(<span class="hljs-built_in">document</span>.documentElement)
  .getPropertyValue(<span class="hljs-string">'content'</span>)
  .replace(<span class="hljs-regexp">/"/g</span>, <span class="hljs-string">''</span>)

<span class="hljs-comment">// value is now "dark", "light" or empty string</span>
</code></pre>
<p>And this works fine! This approach is fine for a <strong>one-time read</strong>, but it’s not reactive: it doesn’t automatically update when the user changes their system color scheme. To pick up a change, a page reload is needed (or the read above has to be repeated at an interval).</p>
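<p>The same one-time read can also be done without the CSS hack, by querying <code>window.matchMedia</code> directly. The sketch below is mine, not from the original article: the helper name and the injected <code>win</code> parameter (which makes the no-support fallback explicit) are assumptions.</p>

```javascript
// Hypothetical one-shot helper: reads the preferred color scheme once.
// `win` is the window-like object to query; it is injected so the
// no-support fallback is explicit and the function is easy to test.
function getPreferredScheme(win) {
  // No matchMedia available (older browsers): report no preference
  if (!win || typeof win.matchMedia !== 'function') {
    return 'no-preference'
  }
  if (win.matchMedia('(prefers-color-scheme: dark)').matches) {
    return 'dark'
  }
  if (win.matchMedia('(prefers-color-scheme: light)').matches) {
    return 'light'
  }
  return 'no-preference'
}

// In a browser you would call it as getPreferredScheme(window)
```

<p>This is still a one-time read, though, so the reactivity question remains.</p>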
<h3 id="heading-reactive-js-approach">Reactive JS approach</h3>
<p>How can we know when the user changes the system color scheme? Are there any events we can listen to?</p>
<p>Yes there are!</p>
<p>There is <a target="_blank" href="https://developer.mozilla.org/en-US/docs/Web/API/Window/matchMedia">window.matchMedia</a> available in <a target="_blank" href="https://caniuse.com/#feat=matchmedia">modern web browsers</a>.</p>
<p>What’s great with <code>matchMedia</code> is that we can attach a listener to it that will be called anytime the match changes.</p>
<p>The listener is called with an object that tells us whether the media query started or stopped matching. With this info, we can skip the CSS altogether and work with JS alone.</p>
<pre><code class="lang-js"><span class="hljs-keyword">const</span> DARK = <span class="hljs-string">'(prefers-color-scheme: dark)'</span>
<span class="hljs-keyword">const</span> LIGHT = <span class="hljs-string">'(prefers-color-scheme: light)'</span>

<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">changeWebsiteTheme</span>(<span class="hljs-params">scheme</span>) </span>{
  <span class="hljs-comment">// 'dark' or 'light' string is in scheme here</span>
  <span class="hljs-comment">// so the website theme can be updated</span>
}

<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">detectColorScheme</span>(<span class="hljs-params"></span>) </span>{
  <span class="hljs-keyword">if</span> (!<span class="hljs-built_in">window</span>.matchMedia) {
    <span class="hljs-keyword">return</span>
  }

  <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">listener</span>(<span class="hljs-params">{ matches, media }</span>) </span>{
    <span class="hljs-keyword">if</span> (!matches) {
      <span class="hljs-comment">// Not matching anymore = not interesting</span>
      <span class="hljs-keyword">return</span>
    }

    <span class="hljs-keyword">if</span> (media === DARK) {
      changeWebsiteTheme(<span class="hljs-string">'dark'</span>)
    } <span class="hljs-keyword">else</span> <span class="hljs-keyword">if</span> (media === LIGHT) {
      changeWebsiteTheme(<span class="hljs-string">'light'</span>)
    }
  }

  <span class="hljs-keyword">const</span> mqDark = <span class="hljs-built_in">window</span>.matchMedia(DARK)
  mqDark.addListener(listener)

  <span class="hljs-keyword">const</span> mqLight = <span class="hljs-built_in">window</span>.matchMedia(LIGHT)
  mqLight.addListener(listener)
}
</code></pre>
<p>This approach works really well in the supported web browsers and just opts out if <code>window.matchMedia</code> isn't supported.</p>
<h3 id="heading-react-hook">React hook</h3>
<p>Since we are using React in <a target="_blank" href="https://github.com/neo4j/neo4j-browser">neo4j-browser</a>, I wrote this as a custom React hook to make it easy to re-use in all of our apps and fit into the React system.</p>
<pre><code class="lang-js"><span class="hljs-comment">// useDetectColorScheme.js</span>
<span class="hljs-keyword">import</span> { useState, useEffect } <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>

<span class="hljs-comment">// Define available themes</span>
<span class="hljs-keyword">export</span> <span class="hljs-keyword">const</span> colorSchemes = {
  <span class="hljs-attr">DARK</span>: <span class="hljs-string">'(prefers-color-scheme: dark)'</span>,
  <span class="hljs-attr">LIGHT</span>: <span class="hljs-string">'(prefers-color-scheme: light)'</span>,
}

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">useDetectColorScheme</span>(<span class="hljs-params">defaultScheme = <span class="hljs-string">'light'</span></span>) </span>{
  <span class="hljs-comment">// Hook state</span>
  <span class="hljs-keyword">const</span> [scheme, setScheme] = useState(defaultScheme)

  useEffect(<span class="hljs-function">() =&gt;</span> {
    <span class="hljs-comment">// No support for detection</span>
    <span class="hljs-keyword">if</span> (!<span class="hljs-built_in">window</span>.matchMedia) {
      <span class="hljs-keyword">return</span>
    }

    <span class="hljs-comment">// The listener</span>
    <span class="hljs-keyword">const</span> listener = <span class="hljs-function">(<span class="hljs-params">e</span>) =&gt;</span> {
      <span class="hljs-comment">// No match = not interesting</span>
      <span class="hljs-keyword">if</span> (!e || !e.matches) {
        <span class="hljs-keyword">return</span>
      }

      <span class="hljs-comment">// Look for the matching color scheme</span>
      <span class="hljs-comment">// and update the hook state</span>
      <span class="hljs-keyword">const</span> schemeNames = <span class="hljs-built_in">Object</span>.keys(colorSchemes)
      <span class="hljs-keyword">for</span> (<span class="hljs-keyword">let</span> i = <span class="hljs-number">0</span>; i &lt; schemeNames.length; i++) {
        <span class="hljs-keyword">const</span> schemeName = schemeNames[i]

        <span class="hljs-keyword">if</span> (e.media === colorSchemes[schemeName]) {
          setScheme(schemeName.toLowerCase())
          <span class="hljs-keyword">break</span>
        }
      }
    }

    <span class="hljs-comment">// Loop through and setup listeners for the</span>
    <span class="hljs-comment">// media queries we want to monitor</span>
    <span class="hljs-keyword">let</span> activeMatches = []
    <span class="hljs-built_in">Object</span>.keys(colorSchemes).forEach(<span class="hljs-function">(<span class="hljs-params">schemeName</span>) =&gt;</span> {
      <span class="hljs-keyword">const</span> mq = <span class="hljs-built_in">window</span>.matchMedia(colorSchemes[schemeName])

      mq.addListener(listener)
      activeMatches.push(mq)
      listener(mq)
    })

    <span class="hljs-comment">// Remove listeners, no memory leaks</span>
    <span class="hljs-keyword">return</span> <span class="hljs-function">() =&gt;</span> {
      activeMatches.forEach(<span class="hljs-function">(<span class="hljs-params">mq</span>) =&gt;</span> mq.removeListener(listener))
      activeMatches = []
    }
    <span class="hljs-comment">// Run on first load of hook only</span>
  }, [])

  <span class="hljs-comment">// Return the current scheme from state</span>
  <span class="hljs-keyword">return</span> scheme
}
</code></pre>
<p>It’s a bit more code than the first short proof of concept. We have better error detection, and we also remove the event listeners when the hook unmounts.</p>
<p>In our use case, users can choose to override the auto-detected scheme (we offer an outlined theme, for example, often used when doing presentations).</p>
<p>And then use it like this in the application layer:</p>
<pre><code class="lang-jsx"><span class="hljs-comment">// App.jsx</span>
<span class="hljs-keyword">import</span> React <span class="hljs-keyword">from</span> <span class="hljs-string">'react'</span>
<span class="hljs-keyword">import</span> ThemeProvider <span class="hljs-keyword">from</span> <span class="hljs-string">'./ThemeProvider'</span>
<span class="hljs-keyword">import</span> useDetectColorScheme <span class="hljs-keyword">from</span> <span class="hljs-string">'./useDetectColorScheme'</span>
<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">App</span>(<span class="hljs-params">{ configuredTheme, themeData, children }</span>) </span>{
  <span class="hljs-comment">// Detect scheme and have 'light' as the default</span>
  <span class="hljs-keyword">const</span> autoScheme = useDetectColorScheme(<span class="hljs-string">'light'</span>)

  <span class="hljs-comment">// Check if the user wants to override the auto-detected scheme</span>
  <span class="hljs-keyword">const</span> scheme = configuredTheme === <span class="hljs-string">'auto'</span> ? autoScheme : configuredTheme

  <span class="hljs-comment">// Pass the theme data to a theme provider component</span>
  <span class="hljs-keyword">return</span> <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">ThemeProvider</span> <span class="hljs-attr">theme</span>=<span class="hljs-string">{themeData[scheme]}</span>&gt;</span>{children}<span class="hljs-tag">&lt;/<span class="hljs-name">ThemeProvider</span>&gt;</span></span>
}
</code></pre>
<p>The last part depends on how theming is made in your application. In the example above, the theme data object is passed into a context provider that makes this object available throughout the whole React application.</p>
<h3 id="heading-end-result">End result</h3>
<p>Here’s a gif with the end result, and as you can see, it’s instant.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*dp2Nj97f12YMhEXuUiybTA.gif" alt="Image" width="600" height="400" loading="lazy"></p>
<h3 id="heading-final-thoughts">Final thoughts</h3>
<p>This was a fun experiment made during a so-called “Lab Day” we have in the UX team at <a target="_blank" href="https://neo4j.com">Neo4j</a>. The early stage of the spec and (therefore) the lack of browser support don’t justify shipping this in any product yet. But support might come sooner rather than later.</p>
<p>And besides, we do ship some Electron-based products, and there is an <a target="_blank" href="https://github.com/electron/electron/blob/master/docs/api/system-preferences.md#systempreferencesisdarkmode-macos"><code>electron.systemPreferences.isDarkMode()</code></a> available there...</p>
<h3 id="heading-about-the-author">About the author</h3>
<p><a target="_blank" href="https://twitter.com/oskarhane">Oskar Hane</a> is a team lead / senior engineer at <a target="_blank" href="https://neo4j.com">Neo4j</a>.<br>He works on several of Neo4j’s end-user applications and code libraries and has authored two tech books.</p>
<p>Follow Oskar on Twitter: <a target="_blank" href="https://twitter.com/oskarhane">@oskarhane</a></p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to ingest data into Neo4j from a Kafka stream ]]>
                </title>
                <description>
                    <![CDATA[ By Andrea Santurbano This article is the second part of the Leveraging Neo4j Streams series (Part 1 is here). I’ll show how to bring Neo4j into your Apache Kafka flow by using the Sink module of the Neo4j Streams project in combination with Apache Sp... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-ingest-data-into-neo4j-from-a-kafka-stream-a34f574f5655/</link>
                <guid isPermaLink="false">66c352d879660e79296c1dcf</guid>
                
                    <category>
                        <![CDATA[ Apache Kafka ]]>
                    </category>
                
                    <category>
                        <![CDATA[ #apache-spark ]]>
                    </category>
                
                    <category>
                        <![CDATA[ data ]]>
                    </category>
                
                    <category>
                        <![CDATA[ kafka ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Neo4j ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Fri, 15 Feb 2019 16:47:10 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*I3lIfJ7LFzRpfk0hdAbsww.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Andrea Santurbano</p>
<p>This article is the second part of the <strong>Leveraging Neo4j Streams</strong> series (<a target="_blank" href="https://medium.freecodecamp.org/how-to-leverage-neo4j-streams-and-build-a-just-in-time-data-warehouse-64adf290f093">Part 1 is here</a>). I’ll show how to bring Neo4j into your <strong>Apache Kafka</strong> flow by using the Sink module of the <a target="_blank" href="https://medium.com/neo4j/a-new-neo4j-integration-with-apache-kafka-6099c14851d2"><strong>Neo4j Streams</strong></a> project in combination with <strong>Apache Spark</strong>’s Structured Streaming Apis.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/ucRQaTumqnuCgJXKJTQfxR5pnsTeSxEQN0-k" alt="Image" width="600" height="400" loading="lazy">
<em>Photo by <a target="_blank" rel="noopener" href="https://unsplash.com/photos/-qrcOR33ErA?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Hendrik Cornelissen</a> on <a target="_blank" rel="noopener" href="https://unsplash.com/search/photos/stream?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></em></p>
<p>In order to show how to integrate them, simplify the integration, and let you test the whole project yourself, I’ll use <a target="_blank" href="https://towardsdatascience.com/building-a-graph-data-pipeline-with-zeppelin-spark-and-neo4j-8b6b83f4fb70"><strong>Apache Zeppelin</strong></a> <strong>— a notebook runner that simply allows you to <a target="_blank" href="https://towardsdatascience.com/building-a-graph-data-pipeline-with-zeppelin-spark-and-neo4j-8b6b83f4fb70">natively interact with Neo4j</a>.</strong></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/btZPmk6Xpd650yTa-cU7FgXPvdlMMbEGrM7K" alt="Image" width="600" height="400" loading="lazy">
<em>The result</em></p>
<h3 id="heading-leveraging-neo4j-streams">Leveraging Neo4j Streams</h3>
<p>The Neo4j Streams project is composed of three main pillars:</p>
<ul>
<li>The <strong>Change Data Capture</strong> that allows you to stream database changes over Kafka topics</li>
<li>The <strong>Sink</strong> (the subject of this article) that allows consuming data streams from Kafka topics</li>
<li>A <strong>set of procedures</strong> that allows you to Produce/Consume data to/from Kafka Topics</li>
</ul>
<h3 id="heading-the-neo4j-streams-sink">The Neo4j Streams Sink</h3>
<p>This module allows Neo4j to consume data from a Kafka topic. It does so in a “smart” way, by letting you define your own custom queries. All you need to do is write something like this in your neo4j.conf:</p>
<pre><code>streams.sink.topic.cypher.&lt;TOPIC&gt;=<span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">CYPHER_QUERY</span>&gt;</span></span>
</code></pre><p>So if you define a query just like this:</p>
<pre><code>streams.sink.topic.cypher.my-topic=MERGE (n:Person{<span class="hljs-attr">id</span>: event.id}) \
    ON CREATE SET n += event.properties
</code></pre><p>And for events like this:</p>
<pre><code>{<span class="hljs-attr">id</span>:<span class="hljs-string">"alice@example.com"</span>,<span class="hljs-attr">properties</span>:{<span class="hljs-attr">name</span>:<span class="hljs-string">"Alice"</span>,<span class="hljs-attr">age</span>:<span class="hljs-number">32</span>}}
</code></pre><p>Under the hood the Sink module will execute a query like this:</p>
<pre><code>UNWIND {batch} AS event
MERGE (n:Person {<span class="hljs-attr">id</span>: event.id})
    ON CREATE SET n += event.properties
</code></pre><p>The <code>batch</code> parameter is a set of Kafka events gathered by the Sink and processed in a single transaction in order to maximize execution efficiency.</p>
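<p>To make the batching concrete, here’s a small Python sketch (illustrative only, not the module’s actual code) of what the generated <code>UNWIND</code> query does: every event in the batch flows through the same template within one unit of work:</p>

```python
# Illustrative sketch (not the module's actual code): the Sink gathers a
# batch of Kafka events and applies one Cypher template to all of them in
# a single unit of work, like the generated UNWIND query does.

def run_batched(events, apply_event):
    # Mirrors: UNWIND {batch} AS event ... (one transaction for the whole batch)
    return [apply_event(event) for event in events]

def merge_person(event):
    # Mirrors: MERGE (n:Person {id: event.id}) ON CREATE SET n += event.properties
    node = {"label": "Person", "id": event["id"]}
    node.update(event["properties"])
    return node

batch = [
    {"id": "alice@example.com", "properties": {"name": "Alice", "age": 32}},
    {"id": "bob@example.com", "properties": {"name": "Bob", "age": 42}},
]

nodes = run_batched(batch, merge_person)
print(nodes[0]["name"])  # Alice
```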
<p>So continuing with the example above, a possible full representation could be:</p>
<pre><code>WITH [{<span class="hljs-attr">id</span>:<span class="hljs-string">"alice@example.com"</span>,<span class="hljs-attr">properties</span>:{<span class="hljs-attr">name</span>:<span class="hljs-string">"Alice"</span>,<span class="hljs-attr">age</span>:<span class="hljs-number">32</span>}},
    {<span class="hljs-attr">id</span>:<span class="hljs-string">"bob@example.com"</span>,<span class="hljs-attr">properties</span>:{<span class="hljs-attr">name</span>:<span class="hljs-string">"Bob"</span>,<span class="hljs-attr">age</span>:<span class="hljs-number">42</span>}}] AS batch
UNWIND batch AS event
MERGE (n:Person {<span class="hljs-attr">id</span>: event.id})
    ON CREATE SET n += event.properties
</code></pre><p>This gives developers the power to define their own business rules: you can choose to update, add to, remove, or adapt your graph data based on the events you receive.</p>
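<p>As a rough illustration of why <code>MERGE ... ON CREATE SET</code> is a good fit for event streams, this hypothetical Python sketch mimics its semantics on an in-memory store: the node is created on first sight, and replayed or duplicate events leave it untouched:</p>

```python
# Hypothetical in-memory mimic of MERGE ... ON CREATE SET semantics:
# create on first sight, leave existing data untouched on replays.

graph = {}  # node id -> properties

def merge_on_create(event):
    if event["id"] not in graph:      # only ON CREATE runs the SET
        graph[event["id"]] = dict(event["properties"])

events = [
    {"id": "alice@example.com", "properties": {"name": "Alice", "age": 32}},
    {"id": "alice@example.com", "properties": {"name": "Alice", "age": 99}},  # duplicate
]
for e in events:
    merge_on_create(e)

print(graph["alice@example.com"]["age"])  # 32
```

Because the operation is idempotent, re-delivering the same Kafka event does not corrupt the graph.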
<h3 id="heading-a-simple-use-case-ingest-data-from-open-data-apis">A simple use case: Ingest data from Open Data APIs</h3>
<p>Imagine your data pipeline needs to read data from an Open Data API, enrich it with some other internal source, and in the end persist it into Neo4j. The best solution for doing this is Apache Spark, which lets you manage different data sources with the same Dataset abstraction.</p>
<h4 id="heading-set-up-the-environment">Set-Up the Environment</h4>
<p>In the following <a target="_blank" href="https://github.com/conker84/leveraging-neo4j-streams">GitHub repo</a>, you’ll find all the code necessary to replicate what I’m presenting in this article. All you need to get started is <a target="_blank" href="https://docs.docker.com/"><strong>Docker</strong></a>; then you can simply spin up the stack by entering the directory and executing the following command from the terminal:</p>
<pre><code>$ docker-compose up
</code></pre><p>This will start up the whole environment that comprises:</p>
<ul>
<li>Neo4j + Neo4j Streams module + APOC procedures</li>
<li>Apache Kafka</li>
<li>Apache Spark</li>
<li>Apache Zeppelin</li>
</ul>
<p><img src="https://cdn-media-1.freecodecamp.org/images/lMf2OG4Sw0iv1hVUHehYkcF9b0xFk5H9I8Qe" alt="Image" width="600" height="400" loading="lazy">
<em>The whole architecture based on Docker containers</em></p>
<p>By going into Apache Zeppelin @ <code>http://localhost:8080</code> you’ll find in the directory <code>Medium/Part 2</code> one notebook “<strong>From Open Data to Sink</strong>” which is the subject of this article.</p>
<h4 id="heading-the-open-data-api">The Open Data API</h4>
<p>We’ll choose the Italian Ministry of Health dataset of Pharmacy stores.</p>
<h4 id="heading-define-the-sink-query">Define the Sink Query</h4>
<p>If you go into the <code>docker-compose.yml</code> file you’ll find a new property that corresponds to the Sink query definition:</p>
<pre><code>NEO4J_streams_sink_topic_cypher_pharma: <span class="hljs-string">"
MERGE (p:Pharmacy{fiscalId: event.FISCAL_ID}) ON CREATE SET p.name = event.NAME
MERGE (t:PharmacyType{type: event.TYPE_NAME})
MERGE (a:Address{name: event.ADDRESS + ', ' + event.CITY})
  ON CREATE SET a.latitude = event.LATITUDE,
                a.longitude = event.LONGITUDE,
                a.code = event.POSTAL_CODE,
                a.point = event.POINT
MERGE (c:City{name: event.CITY})
MERGE (p)-[:IS_TYPE]-&gt;(t)
MERGE (p)-[:HAS_ADDRESS]-&gt;(a)
MERGE (a)-[:IS_LOCATED_IN]-&gt;(c)"</span>
</code></pre><p>The <code>NEO4J_streams_sink_topic_cypher_pharma</code> property defines that all the data that comes from a topic named <code>pharma</code> will be consumed with the corresponding query.</p>
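<p>The routing implied by that naming convention can be sketched in a few lines of Python (an illustration of the idea, not the module’s actual parser): strip the <code>streams.sink.topic.cypher.</code> prefix and use the remainder as the topic name:</p>

```python
# Assumption: this mirrors the idea, not the module's actual parser.
# Each streams.sink.topic.cypher.<TOPIC> property routes one Kafka topic
# to its own Cypher template.

PREFIX = "streams.sink.topic.cypher."

def topic_queries(config):
    # keep only the sink properties, keyed by topic name
    return {key[len(PREFIX):]: query
            for key, query in config.items()
            if key.startswith(PREFIX)}

config = {
    "streams.sink.topic.cypher.pharma": "MERGE (p:Pharmacy {fiscalId: event.FISCAL_ID}) ...",
    "dbms.connector.bolt.enabled": "true",
}

mapping = topic_queries(config)
print(sorted(mapping))  # ['pharma']
```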
<p>The graph model that results from the query above is:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/mMpAsz0co84uukQ95JDiNSWYQvnMrM7PcYsk" alt="Image" width="600" height="400" loading="lazy">
<em>Our data model</em></p>
<h4 id="heading-the-notebook-from-open-data-to-sink">The Notebook — <strong>From Open Data to Sink</strong></h4>
<p>The first step is to download the CSV from the Open Data Portal and load it into a Spark DataFrame:</p>
<pre><code>val fileUrl = z.input(<span class="hljs-string">"File Url"</span>).toString

val url = <span class="hljs-keyword">new</span> java.net.URL(fileUrl)
val localFilePath = s<span class="hljs-string">"/zeppelin/spark-warehouse/${url.getPath.split("</span>/<span class="hljs-string">").last}"</span>

val src = scala.io.Source.fromURL(fileUrl)(<span class="hljs-string">"ISO-8859-1"</span>)
val out = <span class="hljs-keyword">new</span> java.io.FileWriter(localFilePath)
out.write(src.mkString)
out.close

val csvDF = (spark.read
    .format(<span class="hljs-string">"csv"</span>)
    .option(<span class="hljs-string">"delimiter"</span>, <span class="hljs-string">";"</span>)
    .option(<span class="hljs-string">"header"</span>, <span class="hljs-string">"true"</span>)
    .load(localFilePath))
</code></pre><p>Now let’s explore the structure of the <code>csvDF</code>:</p>
<pre><code>root
|-- CODICEIDENTIFICATIVOFARMACIA: string (nullable = <span class="hljs-literal">true</span>)
|-- CODFARMACIAASSEGNATODAASL: string (nullable = <span class="hljs-literal">true</span>)
|-- INDIRIZZO: string (nullable = <span class="hljs-literal">true</span>)
|-- DESCRIZIONEFARMACIA: string (nullable = <span class="hljs-literal">true</span>)
|-- PARTITAIVA: string (nullable = <span class="hljs-literal">true</span>)
|-- CAP: string (nullable = <span class="hljs-literal">true</span>)
|-- CODICECOMUNEISTAT: string (nullable = <span class="hljs-literal">true</span>)
|-- DESCRIZIONECOMUNE: string (nullable = <span class="hljs-literal">true</span>)
|-- FRAZIONE: string (nullable = <span class="hljs-literal">true</span>)
|-- CODICEPROVINCIAISTAT: string (nullable = <span class="hljs-literal">true</span>)
|-- SIGLAPROVINCIA: string (nullable = <span class="hljs-literal">true</span>)
|-- DESCRIZIONEPROVINCIA: string (nullable = <span class="hljs-literal">true</span>)
|-- CODICEREGIONE: string (nullable = <span class="hljs-literal">true</span>)
|-- DESCRIZIONEREGIONE: string (nullable = <span class="hljs-literal">true</span>)
|-- DATAINIZIOVALIDITA: string (nullable = <span class="hljs-literal">true</span>)
|-- DATAFINEVALIDITA: string (nullable = <span class="hljs-literal">true</span>)
|-- DESCRIZIONETIPOLOGIA: string (nullable = <span class="hljs-literal">true</span>)
|-- CODICETIPOLOGIA: string (nullable = <span class="hljs-literal">true</span>)
|-- LATITUDINE: string (nullable = <span class="hljs-literal">true</span>)
|-- LONGITUDINE: string (nullable = <span class="hljs-literal">true</span>)
|-- LOCALIZE: string (nullable = <span class="hljs-literal">true</span>)
</code></pre><p>We want to focus on two fields:</p>
<ul>
<li><strong>CODICEIDENTIFICATIVOFARMACIA</strong>: it “should” be the unique identifier given by the Italian Ministry of Health to a Pharmacy Store</li>
<li><strong>DATAFINEVALIDITA</strong>: it indicates if the Pharmacy Store is still active (if it has no value it is active, otherwise it is closed)</li>
</ul>
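<p>The filter applied later in the notebook’s Spark SQL can be sketched in plain Python (the sample rows here are made up):</p>

```python
# Plain-Python sketch of the WHERE clause used in the notebook's Spark SQL;
# the sample rows are made up for illustration.

rows = [
    {"CODICEIDENTIFICATIVOFARMACIA": "12345", "DATAFINEVALIDITA": "-"},
    {"CODICEIDENTIFICATIVOFARMACIA": "-", "DATAFINEVALIDITA": "-"},
    {"CODICEIDENTIFICATIVOFARMACIA": "67890", "DATAFINEVALIDITA": "2018-06-30"},
]

def keep(row):
    # mirrors: WHERE DATAFINEVALIDITA <> '-' AND CODICEIDENTIFICATIVOFARMACIA <> '-'
    return (row["DATAFINEVALIDITA"] != "-"
            and row["CODICEIDENTIFICATIVOFARMACIA"] != "-")

kept = [r for r in rows if keep(r)]
print(len(kept))  # 1
```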
<p>We now save the Dataframe into a Spark temp view called <code>OPEN_DATA</code>:</p>
<pre><code>csvDF.createOrReplaceTempView(<span class="hljs-string">"open_data"</span>)
</code></pre><p>Let’s now overwrite the <code>OPEN_DATA</code> temp view by filtering the dataset for valid records and renaming some fields:</p>
<pre><code>%sql
CREATE OR REPLACE TEMP VIEW OPEN_DATA AS
SELECT CODICEIDENTIFICATIVOFARMACIA AS PHARMA_ID,
 INDIRIZZO AS ADDRESS,
 DESCRIZIONEFARMACIA AS NAME,
 PARTITAIVA AS FISCAL_ID,
 CAP AS POSTAL_CODE,
 DESCRIZIONECOMUNE AS CITY,
 DESCRIZIONEPROVINCIA AS PROVINCE,
 DATAFINEVALIDITA,
 DESCRIZIONETIPOLOGIA AS TYPE_NAME,
 CODICETIPOLOGIA AS TYPE,
 REPLACE(LATITUDINE, ',', '.') AS LATITUDE,
 REPLACE(LONGITUDINE, ',', '.') AS LONGITUDE,
 REPLACE(LATITUDINE, ',', '.') || ',' || REPLACE(LONGITUDINE, ',', '.') AS POINT
FROM OPEN_DATA
WHERE DATAFINEVALIDITA &lt;&gt; '-'
AND CODICEIDENTIFICATIVOFARMACIA &lt;&gt; '-'
</code></pre><p>Let’s now create the <code>OPEN_DATA_KAFKA_STAGE</code> temp view that must contain two columns:</p>
<ul>
<li><strong>VALUE</strong>: JSON that represents the data that we want to send to the Kafka topic</li>
<li><strong>KEY</strong>: a key that identifies the row</li>
</ul>
<p>You may notice that this is exactly the minimum requirement for a <code>ProducerRecord</code>:</p>
<pre><code>%sql
CREATE OR REPLACE TEMP VIEW OPEN_DATA_KAFKA_STAGE AS
SELECT TO_JSON(
    STRUCT(PHARMA_ID,
        ADDRESS,
        NAME,
        FISCAL_ID,
        POSTAL_CODE,
        CITY,
        PROVINCE,
        TYPE_NAME,
        TYPE,
        LATITUDE,
        LONGITUDE,
        POINT)
    ) AS VALUE,
    PHARMA_ID AS KEY
FROM OPEN_DATA
</code></pre><p>Let’s now send the data to the <code>pharma</code> topic via Spark:</p>
<pre><code>(spark.table(<span class="hljs-string">"OPEN_DATA_KAFKA_STAGE"</span>).selectExpr(<span class="hljs-string">"CAST(key AS STRING)"</span>, <span class="hljs-string">"CAST(value AS STRING)"</span>)
    .write
    .format(<span class="hljs-string">"kafka"</span>)
    .option(<span class="hljs-string">"kafka.enable.auto.commit"</span>, <span class="hljs-string">"true"</span>)
    .option(<span class="hljs-string">"kafka.bootstrap.servers"</span>, <span class="hljs-string">"broker:9093"</span>)
    .option(<span class="hljs-string">"topic"</span>, <span class="hljs-string">"pharma"</span>)
    .save())
</code></pre><p>The data streamed to the <code>pharma</code> topic by the Spark job will now be consumed by the Neo4j Streams Sink module, thanks to the Cypher template that we defined at the beginning of the article.</p>
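<p>What the Spark job stages for each row boils down to a (key, value) pair, with the value serialized as JSON. A minimal Python sketch (with made-up field values, not the job itself):</p>

```python
import json

# Illustrative sketch (made-up field values): each staged row becomes a
# (key, value) pair -- the minimum a Kafka ProducerRecord needs -- with
# PHARMA_ID as the key and the row serialized to JSON as the value.

def to_record(row):
    return row["PHARMA_ID"], json.dumps(row, sort_keys=True)

key, value = to_record({
    "PHARMA_ID": "12345",
    "NAME": "Farmacia Centrale",
    "CITY": "Torino",
})

print(key)  # 12345
```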
<p>Now in the final paragraph, we can explore the ingested data. In the following video we are exploring all the Pharmacy stores located in Turin:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/w4WY8A3yV-wnxKngp167PaDrrlZxNnZKfwyZ" alt="Image" width="600" height="400" loading="lazy">
<em>Explore the data just ingested</em></p>
<h3 id="heading-wrapping-up">Wrapping up</h3>
<p>In this second article (<a target="_blank" href="https://medium.freecodecamp.org/how-to-leverage-neo4j-streams-and-build-a-just-in-time-data-warehouse-64adf290f093">please check the first one</a> if you haven’t already) we have seen how to use the Sink module to transform Apache Kafka events into arbitrary graph structures. You can do it in a very simple way by using the Apache Spark APIs.</p>
<p>In Part 3 we’ll discover how to use the Streams procedure in order to produce/consume data directly via Cypher queries, so please stay tuned!</p>
<p>If you have already tested the Neo4j-Streams module or tested it via this notebook please fill out our <a target="_blank" href="https://goo.gl/forms/VLwvqwsIvdfdm9fL2"><strong>feedback survey</strong></a>.</p>
<p>If you run into any issues or have thoughts about improving our work, <a target="_blank" href="http://github.com/neo4j-contrib/neo4j-streams/issues">please raise a GitHub issue</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to leverage Neo4j Streams and build a just-in-time data warehouse ]]>
                </title>
                <description>
                    <![CDATA[ By Andrea Santurbano In this article, we’ll show how to create a Just-In-Time Data Warehouse by using Neo4j and the Neo4j Streams module with Apache Spark’s Structured Streaming Apis and Apache Kafka. In order to show how to integrate them, simplify ... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-leverage-neo4j-streams-and-build-a-just-in-time-data-warehouse-64adf290f093/</link>
                <guid isPermaLink="false">66c3531b0107ba195e79f72a</guid>
                
                    <category>
                        <![CDATA[ Apache Kafka ]]>
                    </category>
                
                    <category>
                        <![CDATA[ kafka ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Neo4j ]]>
                    </category>
                
                    <category>
                        <![CDATA[ General Programming ]]>
                    </category>
                
                    <category>
                        <![CDATA[ streaming ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Tue, 29 Jan 2019 16:37:47 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*lwaAjWM8LuAvRZ1T67vWQw.jpeg" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Andrea Santurbano</p>
<p>In this article, we’ll show how to create a <a target="_blank" href="https://databricks.com/blog/2015/11/30/building-a-just-in-time-data-warehouse-platform-with-databricks.html">Just-In-Time Data Warehouse</a> by using <a target="_blank" href="https://neo4j.com/"><strong>Neo4j</strong></a> <strong>and the <a target="_blank" href="https://medium.com/neo4j/a-new-neo4j-integration-with-apache-kafka-6099c14851d2">Neo4j Streams</a></strong> module with <strong>Apache Spark</strong>’s Structured Streaming Apis and <strong>Apache Kafka.</strong></p>
<p>In order to show how to integrate them, simplify the integration, and let you test the whole project by hand, I’ll use <a target="_blank" href="https://towardsdatascience.com/building-a-graph-data-pipeline-with-zeppelin-spark-and-neo4j-8b6b83f4fb70"><strong>Apache Zeppelin</strong></a>, <strong>a notebook runner that simply allows you to <a target="_blank" href="https://towardsdatascience.com/building-a-graph-data-pipeline-with-zeppelin-spark-and-neo4j-8b6b83f4fb70">natively interact with Neo4j</a>.</strong></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/qrtYkmywS6MwhLmsVKauFIQHuuu2vKwNUmp7" alt="Image" width="600" height="400" loading="lazy">
<em>The final result: how a kafka event streamed by Neo4j gets collected by Apache Spark</em></p>
<h3 id="heading-leveraging-neo4j-streams">Leveraging Neo4j Streams</h3>
<p>The Neo4j Streams project is composed of three main pillars:</p>
<ul>
<li>The <strong>Change Data Capture</strong> (the subject of this first article) that allows us to stream database changes over Kafka topics</li>
<li>The <strong>Sink</strong> that allows consuming data streams from the Kafka topic</li>
<li>A <strong>set of procedures</strong> that allows us to Produce/Consume data to/from Kafka Topics</li>
</ul>
<h3 id="heading-what-is-a-change-data-capture">What is a Change Data Capture?</h3>
<p>It’s a system that automatically captures changes from a source system (a database, for instance) and provides them to downstream systems for a variety of use cases.</p>
<p>CDC typically forms part of an ETL pipeline. This is an important component for ensuring Data Warehouses (DWH) are kept up to date with any record changes.</p>
<p>Traditionally, CDC applications work off transaction logs, which allows them to replicate a database without much impact on the performance of its operation.</p>
<h3 id="heading-how-does-the-neo4j-streams-cdc-module-deal-with-database-changes">How does the Neo4j Streams CDC module deal with database changes?</h3>
<p>Every transaction inside Neo4j gets captured and transformed into a stream of the transaction’s atomic elements.</p>
<p>Let’s suppose we have a simple creation of two nodes and one relationship between them:</p>
<pre><code>CREATE (andrea:Person{<span class="hljs-attr">name</span>:<span class="hljs-string">"Andrea"</span>})-[knows:KNOWS{<span class="hljs-attr">since</span>:<span class="hljs-number">2014</span>}]-&gt;(michael:Person{<span class="hljs-attr">name</span>:<span class="hljs-string">"Michael"</span>})
</code></pre><p>The CDC module will transform this transaction into 3 events (2 node creations, 1 relationship creation).</p>
<p>The Event structure was inspired by the <a target="_blank" href="https://debezium.io/">Debezium</a> format and has the following general structure:</p>
<pre><code>{
  <span class="hljs-string">"meta"</span>: { <span class="hljs-comment">/* transaction meta-data */</span> },
  <span class="hljs-string">"payload"</span>: { <span class="hljs-comment">/* the data related to the transaction */</span>
    <span class="hljs-string">"before"</span>: { <span class="hljs-comment">/* the data before the transaction */</span> },
    <span class="hljs-string">"after"</span>: { <span class="hljs-comment">/* the data after the transaction */</span> }
  }
}
</code></pre><p>Node source <code>(andrea)</code>:</p>
<pre><code>{
  <span class="hljs-string">"meta"</span>: {
    <span class="hljs-string">"timestamp"</span>: <span class="hljs-number">1532597182604</span>,
    <span class="hljs-string">"username"</span>: <span class="hljs-string">"neo4j"</span>,
    <span class="hljs-string">"tx_id"</span>: <span class="hljs-number">1</span>,
    <span class="hljs-string">"tx_event_id"</span>: <span class="hljs-number">0</span>,
    <span class="hljs-string">"tx_events_count"</span>: <span class="hljs-number">3</span>,
    <span class="hljs-string">"operation"</span>: <span class="hljs-string">"created"</span>,
    <span class="hljs-string">"source"</span>: {
      <span class="hljs-string">"hostname"</span>: <span class="hljs-string">"neo4j.mycompany.com"</span>
    }
  },
  <span class="hljs-string">"payload"</span>: {
    <span class="hljs-string">"id"</span>: <span class="hljs-string">"1004"</span>,
    <span class="hljs-string">"type"</span>: <span class="hljs-string">"node"</span>,
    <span class="hljs-string">"after"</span>: {
      <span class="hljs-string">"labels"</span>: [<span class="hljs-string">"Person"</span>],
      <span class="hljs-string">"properties"</span>: {
        <span class="hljs-string">"name"</span>: <span class="hljs-string">"Andrea"</span>
      }
    }
  }
}
</code></pre><p>Node target <code>(michael)</code>:</p>
<pre><code>{
  <span class="hljs-string">"meta"</span>: {
    <span class="hljs-string">"timestamp"</span>: <span class="hljs-number">1532597182604</span>,
    <span class="hljs-string">"username"</span>: <span class="hljs-string">"neo4j"</span>,
    <span class="hljs-string">"tx_id"</span>: <span class="hljs-number">1</span>,
    <span class="hljs-string">"tx_event_id"</span>: <span class="hljs-number">1</span>,
    <span class="hljs-string">"tx_events_count"</span>: <span class="hljs-number">3</span>,
    <span class="hljs-string">"operation"</span>: <span class="hljs-string">"created"</span>,
    <span class="hljs-string">"source"</span>: {
      <span class="hljs-string">"hostname"</span>: <span class="hljs-string">"neo4j.mycompany.com"</span>
    }
  },
  <span class="hljs-string">"payload"</span>: {
    <span class="hljs-string">"id"</span>: <span class="hljs-string">"1006"</span>,
    <span class="hljs-string">"type"</span>: <span class="hljs-string">"node"</span>,
    <span class="hljs-string">"after"</span>: {
      <span class="hljs-string">"labels"</span>: [<span class="hljs-string">"Person"</span>],
      <span class="hljs-string">"properties"</span>: {
        <span class="hljs-string">"name"</span>: <span class="hljs-string">"Michael"</span>
      }
    }
  }
}
</code></pre><p>Relationship <code>knows</code>:</p>
<pre><code>{
  <span class="hljs-string">"meta"</span>: {
    <span class="hljs-string">"timestamp"</span>: <span class="hljs-number">1532597182604</span>,
    <span class="hljs-string">"username"</span>: <span class="hljs-string">"neo4j"</span>,
    <span class="hljs-string">"tx_id"</span>: <span class="hljs-number">1</span>,
    <span class="hljs-string">"tx_event_id"</span>: <span class="hljs-number">2</span>,
    <span class="hljs-string">"tx_events_count"</span>: <span class="hljs-number">3</span>,
    <span class="hljs-string">"operation"</span>: <span class="hljs-string">"created"</span>,
    <span class="hljs-string">"source"</span>: {
      <span class="hljs-string">"hostname"</span>: <span class="hljs-string">"neo4j.mycompany.com"</span>
    }
  },
  <span class="hljs-string">"payload"</span>: {
    <span class="hljs-string">"id"</span>: <span class="hljs-string">"1007"</span>,
    <span class="hljs-string">"type"</span>: <span class="hljs-string">"relationship"</span>,
    <span class="hljs-string">"label"</span>: <span class="hljs-string">"KNOWS"</span>,
    <span class="hljs-string">"start"</span>: {
      <span class="hljs-string">"labels"</span>: [<span class="hljs-string">"Person"</span>],
      <span class="hljs-string">"id"</span>: <span class="hljs-string">"1004"</span>
    },
    <span class="hljs-string">"end"</span>: {
      <span class="hljs-string">"labels"</span>: [<span class="hljs-string">"Person"</span>],
      <span class="hljs-string">"id"</span>: <span class="hljs-string">"1006"</span>
    },
    <span class="hljs-string">"after"</span>: {
      <span class="hljs-string">"properties"</span>: {
        <span class="hljs-string">"since"</span>: <span class="hljs-number">2014</span>
      }
    }
  }
}
</code></pre><p>By default, all the data will be streamed on the <code>neo4j</code> topic. The CDC module allows controlling which nodes are sent to Kafka, and which of their properties you want to send to the topic:</p>
<pre><code>streams.source.topic.nodes.&lt;TOPIC_NAME&gt;=<span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">PATTERN</span>&gt;</span></span>
</code></pre><p>With the following example:</p>
<pre><code>streams.source.topic.nodes.products=Product{name, code}
</code></pre><p>The CDC module will send to the <code>products</code> topic all the nodes that have the label <code>Product</code>, and only the changes to their <code>name</code> and <code>code</code> properties. Please go to the official documentation for a full description of <a target="_blank" href="https://neo4j-contrib.github.io/neo4j-streams/#_patterns">how label filtering works</a>.</p>
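<p>If you want to experiment with consumers before wiring up the real modules, a hypothetical Python helper can assemble test events in the Debezium-inspired shape shown above (all values here are made up):</p>

```python
import time

# Hypothetical test-event builder following the Debezium-inspired shape
# shown above; every value is made up for illustration.

def node_created_event(node_id, labels, properties, tx_id, tx_event_id, total):
    return {
        "meta": {
            "timestamp": int(time.time() * 1000),
            "username": "neo4j",
            "tx_id": tx_id,
            "tx_event_id": tx_event_id,
            "tx_events_count": total,
            "operation": "created",
            "source": {"hostname": "neo4j.mycompany.com"},
        },
        "payload": {
            "id": str(node_id),
            "type": "node",
            "after": {"labels": labels, "properties": properties},
        },
    }

event = node_created_event(1004, ["Person"], {"name": "Andrea"},
                           tx_id=1, tx_event_id=0, total=3)
print(event["payload"]["after"]["properties"]["name"])  # Andrea
```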
<p>For a more in-depth description of the Neo4j Streams project and how/why we at <a target="_blank" href="http://www.larus-ba.it/"><strong>LARUS</strong></a> and <a target="_blank" href="https://neo4j.com/"><strong>Neo4j</strong></a> built it, check out this article that provides an <a target="_blank" href="https://medium.com/neo4j/a-new-neo4j-integration-with-apache-kafka-6099c14851d2">in-depth description</a>.</p>
<h3 id="heading-beyond-the-traditional-data-warehouse">Beyond the traditional Data Warehouse</h3>
<p>A traditional DWH requires data teams to constantly build multiple costly and time-consuming Extract Transform Load (ETL) pipelines to ultimately derive business insights.</p>
<p>One of the biggest pain points is that Enterprise Data Warehouses are <strong>inherently rigid</strong>, with an architecture that’s difficult to change. That’s because:</p>
<ul>
<li>they are <strong>based on the</strong> <strong>Schema-On-Write architecture:</strong> first, you define your schema, then you write your data, then you read your data and it comes back in the schema you defined up-front</li>
<li>they are <strong>based</strong> on (expensive) <strong>batched/scheduled jobs</strong></li>
</ul>
<p><strong>This results in having to build costly and time-consuming ETL pipelines</strong> to access and manipulate the data. And as <strong>new data types</strong> and sources are introduced, the need to augment your ETL pipelines <strong>exacerbates the problem</strong>.</p>
<p>Thanks to the <strong>combination</strong> of the stream data processing with the <strong>Neo4j Streams CDC module</strong> and the <strong>Schema-On-Read</strong> approach provided by Apache Spark, we can <strong>overcome this rigidity</strong> and build a new kind of (flexible) DWH.</p>
<h3 id="heading-a-paradigm-shift-just-in-time-data-warehouse">A paradigm shift: Just-In-Time Data Warehouse</h3>
<p>A JIT-DWH solution is designed to easily handle a wider variety of data from different sources and starts from a different approach about how to deal with and manage data: <strong>Schema-On-Read.</strong></p>
<h3 id="heading-schema-on-read">Schema-On-Read</h3>
<p><a target="_blank" href="https://www.marklogic.com/blog/schema-on-read-vs-schema-on-write/">Schema-On-Read</a> follows a different sequence: <strong>it just loads the data as-is and applies your own lens to the data when you read it back out</strong>. With this kind of approach, you can present data in a schema that is adapted best to the queries being issued. You’re not stuck with a one-size-fits-all schema. With schema-on-read, you can present the data back in a schema that is most relevant to the task at hand.</p>
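<p>Schema-on-read can be demonstrated in a few lines of Python (a toy illustration, not part of the pipeline): the raw records are stored as-is, and a “lens” of fields is applied only when reading them back:</p>

```python
import json

# Toy illustration of schema-on-read (not part of the pipeline): records
# are stored raw, and a "lens" of fields is applied only at read time.

raw_lines = [
    '{"name": "Alice", "age": 32}',
    '{"name": "Bob", "age": 42, "gender": "M"}',  # a new field appears later
]

def read_with_schema(lines, fields):
    # choose the schema at read time; older records simply yield None
    return [{f: rec.get(f) for f in fields} for rec in map(json.loads, lines)]

people = read_with_schema(raw_lines, ["name", "gender"])
print(people[1]["gender"])  # M
```

Old records need no migration when a new field appears, which is exactly the flexibility the JIT-DWH relies on.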
<h4 id="heading-set-up-the-environment">Set-Up the Environment</h4>
<p>In the following <a target="_blank" href="https://github.com/conker84/leveraging-neo4j-streams"><strong>Github repo</strong></a> you’ll find everything you need to replicate what I’m presenting in this article. All you need to start is <a target="_blank" href="https://docs.docker.com/"><strong>Docker</strong></a>. Then you can simply spin up the stack by entering the directory and executing the following command from the terminal:</p>
<pre><code>$ docker-compose up
</code></pre><p>This will start up the whole environment that comprises:</p>
<ul>
<li>Neo4j + Neo4j Streams module + APOC procedures</li>
<li>Apache Kafka</li>
<li>Apache Spark</li>
<li>Apache Zeppelin</li>
</ul>
<p><img src="https://cdn-media-1.freecodecamp.org/images/j4n2GGDDTdZZFuoNuyP9eioEs8C4aAk5hfg0" alt="Image" width="600" height="400" loading="lazy">
<em>The whole architecture based on Docker containers</em></p>
<p>By going into Apache Zeppelin at <code>http://localhost:8080</code> you’ll find two notebooks in the directory <code>Medium/Part 1</code>:</p>
<ul>
<li><strong>Create a Just-In-Time Data Warehouse</strong>: in this notebook, we will build the JIT-DWH</li>
<li><strong>Query The JIT-DWH</strong>: in this notebook, we will perform some queries over the JIT-DWH</li>
</ul>
<h3 id="heading-the-use-case">The Use Case</h3>
<p>We’ll create a fake social-network-like dataset. Creating it will activate the CDC module of Neo4j Streams, and via Apache Spark we’ll intercept these events and persist them on the file system as JSON.</p>
<p>Then we’ll demonstrate how new fields added to our nodes are automatically picked up by our JIT-DWH without any modification of the ETL pipeline, thanks to the Schema-On-Read approach.</p>
<p>We’ll execute the following steps:</p>
<ol>
<li>Create the fake data set</li>
<li>Build our data pipeline that intercepts the Kafka events published by the Neo4j Streams CDC module</li>
<li>Make the first query over our JIT-DWH on Spark</li>
<li>Add a new field in our graph model</li>
<li>Show how the new field is automatically exposed in real time thanks to the Neo4j Streams CDC module (without the need for changes over our ETL pipeline thanks to the Schema-On-Read approach).</li>
</ol>
<h3 id="heading-notebook-1-create-a-just-in-time-data-warehouse">Notebook 1: Create a Just-In-Time Data Warehouse</h3>
<p>We’ll create a fake social network by using the APOC <code>apoc.periodic.repeat</code> procedure that executes this query every 15 seconds:</p>
<pre><code>WITH [<span class="hljs-string">"M"</span>, <span class="hljs-string">"F"</span>, <span class="hljs-string">""</span>] AS genderUNWIND range(<span class="hljs-number">1</span>, <span class="hljs-number">10</span>) AS idCREATE (p:Person {<span class="hljs-attr">id</span>: apoc.create.uuid(), <span class="hljs-attr">name</span>: <span class="hljs-string">"Name-"</span> +  apoc.text.random(<span class="hljs-number">10</span>), <span class="hljs-attr">age</span>: round(rand() * <span class="hljs-number">100</span>), <span class="hljs-attr">index</span>: id, <span class="hljs-attr">gender</span>: gender[toInteger(size(gender) * rand())]})WITH collect(p) AS peopleUNWIND people AS p1UNWIND range(<span class="hljs-number">1</span>, <span class="hljs-number">3</span>) AS friendWITH p1, people[(p1.index + friend) % size(people)] AS p2CREATE (p1)-[:KNOWS{<span class="hljs-attr">years</span>: round(rand() * <span class="hljs-number">10</span>), <span class="hljs-attr">engaged</span>: (rand() &gt; <span class="hljs-number">0.5</span>)}]-&amp;gt;(p2)
</code></pre><p>If you need more details about the APOC project, please follow this <a target="_blank" href="https://neo4j-contrib.github.io/neo4j-apoc-procedures/">link</a>.</p>
<p>So the resulting graph model is quite straightforward:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/ZxNqMi-FAYLjHLNr7hFEJ21Bdoq6UtA2Qus2" alt="Image" width="600" height="400" loading="lazy">
<em>The Graph Model</em></p>
<p>Let’s create an index over the Person node:</p>
<pre><code>%neo4j
CREATE INDEX ON :Person(id)
</code></pre><p>Now let’s set the Background Job in Neo4j:</p>
<pre><code>%neo4j
CALL apoc.periodic.repeat('create-fake-social-data',
  'WITH ["M", "F", "X"] AS gender UNWIND range(1, 10) AS id CREATE (p:Person {id: apoc.create.uuid(), name: "Name-" + apoc.text.random(10), age: round(rand() * 100), index: id, gender: gender[toInteger(size(gender) * rand())]}) WITH collect(p) AS people UNWIND people AS p1 UNWIND range(1, 3) AS friend WITH p1, people[(p1.index + friend) % size(people)] AS p2 CREATE (p1)-[:KNOWS{years: round(rand() * 10), engaged: (rand() &gt; 0.5)}]-&gt;(p2)',
  15) YIELD name
RETURN name AS created
</code></pre><p>This background query causes the Neo4j Streams CDC module to stream the related events to the “neo4j” Kafka topic (the default topic of the CDC).</p>
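<p>As an aside, the <code>people[(p1.index + friend) % size(people)]</code> expression in the query above simply wires each person to the next three people by index, wrapping around the list. A quick Python illustration (a hypothetical helper, using 0-based indices for simplicity):</p>

```python
def friends_of(index, num_people=10, fan_out=3):
    """Mirror the Cypher expression people[(p1.index + friend) % size(people)]:
    each person KNOWS the next `fan_out` people by index, modulo the list size."""
    return [(index + friend) % num_people for friend in range(1, fan_out + 1)]

# The person at the end of the list wraps around to the beginning.
last_persons_friends = friends_of(9)   # -> [0, 1, 2]
first_persons_friends = friends_of(0)  # -> [1, 2, 3]
```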
<p>Now let’s create a Structured Streaming Dataset that consumes the data from the “neo4j” topic:</p>
<pre><code>val kafkaStreamingDF = (spark
    .readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9093")
    .option("startingoffsets", "earliest")
    .option("subscribe", "neo4j")
    .load())
</code></pre><p>The <code>kafkaStreamingDF</code> DataFrame is basically a <code>ProducerRecord</code> representation; in fact, its schema is:</p>
<pre><code>root
 |-- key: binary (nullable = true)
 |-- value: binary (nullable = true)
 |-- topic: string (nullable = true)
 |-- partition: integer (nullable = true)
 |-- offset: long (nullable = true)
 |-- timestamp: timestamp (nullable = true)
 |-- timestampType: integer (nullable = true)
</code></pre><p>Now let’s define, via the Spark APIs, the structure of the data streamed by the CDC so that we can read it:</p>
<pre><code>val cdcMetaSchema = (new StructType()
    .add("timestamp", LongType)
    .add("username", StringType)
    .add("operation", StringType)
    .add("source", MapType(StringType, StringType, true)))

val cdcPayloadSchemaBeforeAfter = (new StructType()
    .add("labels", ArrayType(StringType, false))
    .add("properties", MapType(StringType, StringType, true)))

val cdcPayloadSchema = (new StructType()
    .add("id", StringType)
    .add("type", StringType)
    .add("label", StringType)
    .add("start", MapType(StringType, StringType, true))
    .add("end", MapType(StringType, StringType, true))
    .add("before", cdcPayloadSchemaBeforeAfter)
    .add("after", cdcPayloadSchemaBeforeAfter))

val cdcSchema = (new StructType()
    .add("meta", cdcMetaSchema)
    .add("payload", cdcPayloadSchema))
</code></pre><p>The <code>cdcSchema</code> is suitable for both node and relationship events.</p>
<p>What we need now is to extract only the CDC event from the Dataframe, so let’s perform a simple transformation query over Spark:</p>
<pre><code>val cdcDataFrame = (kafkaStreamingDF
    .selectExpr("CAST(value AS STRING) AS VALUE")
    .select(from_json('VALUE, cdcSchema) as 'JSON))
</code></pre><p>The <code>cdcDataFrame</code> contains just one column, <strong>JSON</strong>, which holds the data streamed from the Neo4j-Streams CDC module.</p>
<p>Let’s perform a simple ETL query in order to extract fields of interest:</p>
<pre><code>val dataWarehouseDataFrame = (cdcDataFrame
    .where("json.payload.type = 'node' and (array_contains(nvl(json.payload.after.labels, json.payload.before.labels), 'Person'))")
    .selectExpr("json.payload.id AS neo_id",
        "CAST(json.meta.timestamp / 1000 AS Timestamp) AS timestamp",
        "json.meta.source.hostname AS host",
        "json.meta.operation AS operation",
        "nvl(json.payload.after.labels, json.payload.before.labels) AS labels",
        "explode(json.payload.after.properties)"))
</code></pre><p>This query is quite important, because it determines how the data will be persisted on the filesystem. Every node will be <strong>exploded</strong> into a number of JSON snippets, one for each node property, just like this:</p>
<pre><code>{<span class="hljs-string">"neo_id"</span>:<span class="hljs-string">"35340"</span>,<span class="hljs-string">"timestamp"</span>:<span class="hljs-string">"2018-12-19T23:07:10.465Z"</span>,<span class="hljs-string">"host"</span>:<span class="hljs-string">"neo4j"</span>,<span class="hljs-string">"operation"</span>:<span class="hljs-string">"created"</span>,<span class="hljs-string">"labels"</span>:[<span class="hljs-string">"Person"</span>],<span class="hljs-string">"key"</span>:<span class="hljs-string">"name"</span>,<span class="hljs-string">"value"</span>:<span class="hljs-string">"Name-5wc62uKO5l"</span>}
</code></pre><pre><code>{<span class="hljs-string">"neo_id"</span>:<span class="hljs-string">"35340"</span>,<span class="hljs-string">"timestamp"</span>:<span class="hljs-string">"2018-12-19T23:07:10.465Z"</span>,<span class="hljs-string">"host"</span>:<span class="hljs-string">"neo4j"</span>,<span class="hljs-string">"operation"</span>:<span class="hljs-string">"created"</span>,<span class="hljs-string">"labels"</span>:[<span class="hljs-string">"Person"</span>],<span class="hljs-string">"key"</span>:<span class="hljs-string">"index"</span>,<span class="hljs-string">"value"</span>:<span class="hljs-string">"8"</span>}
</code></pre><pre><code>{<span class="hljs-string">"neo_id"</span>:<span class="hljs-string">"35340"</span>,<span class="hljs-string">"timestamp"</span>:<span class="hljs-string">"2018-12-19T23:07:10.465Z"</span>,<span class="hljs-string">"host"</span>:<span class="hljs-string">"neo4j"</span>,<span class="hljs-string">"operation"</span>:<span class="hljs-string">"created"</span>,<span class="hljs-string">"labels"</span>:[<span class="hljs-string">"Person"</span>],<span class="hljs-string">"key"</span>:<span class="hljs-string">"id"</span>,<span class="hljs-string">"value"</span>:<span class="hljs-string">"944e58bf-0cf7-49cf-af4a-c803d44f222a"</span>}
</code></pre><pre><code>{<span class="hljs-string">"neo_id"</span>:<span class="hljs-string">"35340"</span>,<span class="hljs-string">"timestamp"</span>:<span class="hljs-string">"2018-12-19T23:07:10.465Z"</span>,<span class="hljs-string">"host"</span>:<span class="hljs-string">"neo4j"</span>,<span class="hljs-string">"operation"</span>:<span class="hljs-string">"created"</span>,<span class="hljs-string">"labels"</span>:[<span class="hljs-string">"Person"</span>],<span class="hljs-string">"key"</span>:<span class="hljs-string">"gender"</span>,<span class="hljs-string">"value"</span>:<span class="hljs-string">"F"</span>}
</code></pre><p>This kind of structure can easily be turned into a tabular representation (we’ll see how in the next few steps).</p>
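<p>The explode step can be imitated in plain Python: one CDC node event fans out into one flat record per property. A simplified sketch (not the Spark implementation; the helper name is made up):</p>

```python
def explode_properties(event):
    """Fan one CDC node event out into one flat record per property,
    mirroring Spark's explode() over the properties map."""
    meta = {k: event[k] for k in ("neo_id", "timestamp", "host", "operation", "labels")}
    return [dict(meta, key=k, value=str(v)) for k, v in event["properties"].items()]

node_event = {
    "neo_id": "35340", "timestamp": "2018-12-19T23:07:10.465Z",
    "host": "neo4j", "operation": "created", "labels": ["Person"],
    "properties": {"name": "Name-5wc62uKO5l", "index": 8},
}
records = explode_properties(node_event)  # one record per property
```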
<p>Now let's write a Spark continuous streaming query that saves the data to the file system as JSON:</p>
<pre><code>val writeOnDisk = (dataWarehouseDataFrame
    .writeStream
    .format("json")
    .option("checkpointLocation", "/zeppelin/spark-warehouse/jit-dwh/checkpoint")
    .option("path", "/zeppelin/spark-warehouse/jit-dwh")
    .queryName("nodes")
    .start())
</code></pre><p>We have now created a simple JIT-DWH. In the second notebook we’ll learn how to query it and see how simple it is to deal with dynamic changes in the data structures, thanks to schema-on-read.</p>
<h3 id="heading-notebook-2-query-the-jit-dwh">Notebook 2: Query The JIT-DWH</h3>
<p>The first paragraph lets us query and display our JIT-DWH:</p>
<pre><code>val flattenedDF = (spark.read.format("json").load("/zeppelin/spark-warehouse/jit-dwh/**")
    .where("neo_id is not null")
    .groupBy("neo_id", "timestamp", "host", "labels", "operation")
    .pivot("key")
    .agg(first($"value")))

z.show(flattenedDF)
</code></pre><p>Remember how we saved the data as JSON a few rows above? The <code>flattenedDF</code> simply pivots the JSON records over the <code>key</code> field, grouping the data by the 5 columns that represent the “unique key” (<em>neo_id, timestamp, host, labels, operation</em>). This gives us the following tabular representation of the source data:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/XA4vskTWdUra50ncym941E78zHZcwGM6TY4q" alt="Image" width="600" height="400" loading="lazy">
<em>The result of the query</em></p>
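<p>The pivot that <code>flattenedDF</code> performs can be sketched in plain Python: group the flat key/value records by their identifying columns, then collapse each group back into one row. This is a simplified stand-in for Spark's <code>groupBy(...).pivot("key")</code>, with invented sample values:</p>

```python
from collections import defaultdict

def pivot_records(records):
    """Group flat key/value records by their identifying columns and
    collapse each group back into one tabular row.
    ("labels" is left out of the key here only because Python lists
    aren't hashable; Spark groups on it as well.)"""
    key_cols = ("neo_id", "timestamp", "host", "operation")
    rows = defaultdict(dict)
    for rec in records:
        row_key = tuple(rec[c] for c in key_cols)
        rows[row_key][rec["key"]] = rec["value"]
    return dict(rows)

records = [
    {"neo_id": "35340", "timestamp": "t0", "host": "neo4j",
     "operation": "created", "key": "name", "value": "Name-5wc62uKO5l"},
    {"neo_id": "35340", "timestamp": "t0", "host": "neo4j",
     "operation": "created", "key": "gender", "value": "F"},
]
table = pivot_records(records)  # one row, with name and gender as columns
```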
<p>Now imagine that our Person dataset gets a new field: <strong>birth.</strong> Let's add this new field to one node; in this case, you must choose an id from your dataset and update it with the following paragraph:</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/onz1jNYkzyiaEAFAcYi-4f2lay8rtqs55PBM" alt="Image" width="600" height="400" loading="lazy">
<em>Just fill the form with your data and execute the paragraph</em></p>
<p>Now the final step: reuse the same query, filtering the DWH by the id we just changed, to check that our dataset reflects the change made in Neo4j.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/dTgF8U8-F--yYJOUm3BLa70MaOd0E5XAKnaH" alt="Image" width="600" height="400" loading="lazy">
<em>The birth field is present without changes to our queries</em></p>
<h3 id="heading-conclusions">Conclusions</h3>
<p>In this first part, we learned how to leverage the events produced by the Neo4j Streams CDC module to build a simple (real-time) JIT-DWH that uses the Schema-On-Read approach.</p>
<p>In Part 2 we’ll discover how to use the Sink module in order to ingest data into Neo4j directly from Kafka.</p>
<p>If you have already tried the Neo4j-Streams module, or have tested it via these notebooks, please fill out our <a target="_blank" href="https://goo.gl/forms/VLwvqwsIvdfdm9fL2"><strong>feedback survey</strong></a>.</p>
<p>If you run into any issues or have thoughts about improving our work, <a target="_blank" href="http://github.com/neo4j-contrib/neo4j-streams/issues">please raise a GitHub issue</a>.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ How to embrace event-driven graph analytics using Neo4j and Apache Kafka ]]>
                </title>
                <description>
                    <![CDATA[ By Ljubica Lazarevic Introduction With the new Neo4j Kafka streams now available, my fellow Neo4j colleague Tom Geudens and I were keen to try it out. We have many use-cases in mind that leverage the power of graph databases and event-driven architec... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/how-to-embrace-event-driven-graph-analytics-using-neo4j-and-apache-kafka-474c9f405e06/</link>
                <guid isPermaLink="false">66c351e4765a634c3485fe12</guid>
                
                    <category>
                        <![CDATA[ analytics ]]>
                    </category>
                
                    <category>
                        <![CDATA[ data ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Neo4j ]]>
                    </category>
                
                    <category>
                        <![CDATA[ General Programming ]]>
                    </category>
                
                    <category>
                        <![CDATA[ tech  ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Thu, 24 Jan 2019 08:12:47 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/0*MUKvlO22WXUc03qd" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Ljubica Lazarevic</p>
<h3 id="heading-introduction">Introduction</h3>
<p>With the new <a target="_blank" href="https://neo4j-contrib.github.io/neo4j-streams/">Neo4j Kafka streams</a> now available, my fellow Neo4j colleague <a target="_blank" href="https://twitter.com/tomgeudens"><strong>Tom Geudens</strong></a> and I were keen to try it out. We have many use-cases in mind that leverage the power of graph databases and event-driven architectures. The first one we explore combines the power of Graph Algorithms with a transactional database.</p>
<p>The new Neo4j Kafka streams library is a Neo4j plugin that you can add to each of your Neo4j instances. It enables three types of Apache Kafka mechanisms:</p>
<ul>
<li>Producer: based on the topics set up in the Neo4j configuration file. Outputs to said topics will happen when specified node or relationship types change</li>
<li>Consumer: based on the topics set up in the Neo4j configuration file. When events for said topics are picked up, the specified Cypher query for each topic will be executed</li>
<li>Procedure: a direct call in Cypher to publish a given payload to a specified topic</li>
</ul>
<p>You can get a more detailed overview of what each of these looks like <a target="_blank" href="https://neo4j-contrib.github.io/neo4j-streams/">here</a>.</p>
<h3 id="heading-overview-of-the-situation">Overview of the situation</h3>
<p>Graph algorithms provide powerful analytical abilities. They help us understand the context of our data better by analysing relationships. For example, graph algorithms are used to:</p>
<ul>
<li>Understand network dependencies</li>
<li>Detect communities</li>
<li>Identify influencers</li>
<li>Calculate recommendations</li>
<li>And so forth.</li>
</ul>
<p>Neo4j offers a set of <a target="_blank" href="https://neo4j.com/docs/graph-algorithms/current/">graph algorithms</a> out of the box via a plugin that can run directly on data within Neo4j. This library of algorithms has been very well received. Many times I’ve heard feedback that the plugins are as fast as or faster than what clients have used before. With such wonderful feedback, why wouldn’t we want to apply these optimised, performant algorithms to a Neo4j database?</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/p5CjHoBNN8tRfzY09tA-td5jG7N2Rybn-3GW" alt="Image" width="600" height="400" loading="lazy">
<em>The Neo4j graph algorithm categories</em></p>
<p>Getting the full advantage of any analytical process needs resources. To get a nice, performant experience, we want to provide as much CPU and memory as we can afford.</p>
<p>Now, we could run this kind of work on our transactional cluster. But in this typical architecture we run into some challenges. For example, if one machine is big, the other machines in the cluster should match it, which can make the architecture expensive.</p>
<p>The other challenge we face is that our cluster is supposed to be managing transactions — day-to-day queries such as processing requests. We don’t want to weigh it down with crunching through various iterations and permutations of a model. Ideally, we want to offload this along with associated analytical work.</p>
<p>If we know that the heavy querying that is going to take place is read-only, then it’s an easy solution. We can spin up read replicas to manage the load. This keeps the cluster focussed on what it’s supposed to be doing, supporting an operational, transactional system.</p>
<p>But how do we handle write backs to the operational graph as part of the analytical processing? We want those results, such as recommendations, as soon as they are available.</p>
<p>Read replicas are, as the name suggests, for read-only applications. They are involved neither in the election of cluster leaders nor in writing. Using Neo4j-Streams, we can stream the results from the read replica back to the cluster for consumption.</p>
<p>The big advantages of approaching it this way include:</p>
<ul>
<li>We have our high availability/disaster recovery afforded to us from the cluster.</li>
<li>The data is going to be identical on both the read replica and the cluster. We don’t have to worry about updating the read replica because the cluster is going to take care of that for us.</li>
<li>The IDs for nodes and relationships will be identical on both the cluster servers and the read replica. This makes updating really easy.</li>
<li>We can provision resources as necessary to the read replica, which is likely to be very different from the cluster.</li>
</ul>
<p>Our architecture will look like the figure below. A is our read replica, and B is our causal cluster. A will receive transactional information from B. Any results calculated by A will be streamed back to B via Kafka messages.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/dlUfqTqASS6Q4yXg1zHCJZ97Ez-ufEMiqESh" alt="Image" width="600" height="400" loading="lazy"></p>
<p>So with our updated pattern, let’s continue with our simple example.</p>
<h3 id="heading-the-example-data-set">The Example Data Set</h3>
<p>We’re going to use the Movie Database data set available from the <code>:play movie-guide</code> guide in Neo4j Browser. For this example we are going to use four Neo4j instances:</p>
<ul>
<li>The analytics instance — this is going to be our read replica, and on this instance we’re going to run PageRank on all Person nodes in the data set. We will call the <code>streams.publish()</code> procedure to post the output to our Kafka topic.</li>
<li>The operational instances — this is going to be our three-server causal cluster, which will listen for any changes to Person nodes and apply updates as changes come in.</li>
</ul>
<p>For Kafka, we’ll follow the instructions from the <a target="_blank" href="https://kafka.apache.org/quickstart">quick start guide</a> up until step 2. Before we get Kafka up and running, we will need to set up the consumer elements in the Neo4j configuration files. We also will set up the cluster itself. Please note that at the moment neo4j-streams only works with <strong>Neo4j version 3.4.x</strong>.</p>
<p>To set up the three server clusters and a read replica, we will follow the instructions provided in the <a target="_blank" href="https://neo4j.com/docs/operations-manual/current/tutorial/local-causal-cluster/">Neo4j operations manual</a>. Follow the instructions for the cores, and also for one read replica.</p>
<p>Additionally, we’re going to need to add the following to <strong>neo4j.config</strong> for the causal cluster servers:</p>
<pre><code>#************
# Kafka Config — Consumer
#************
kafka.zookeeper.connect=localhost:2181
kafka.bootstrap.servers=localhost:9092
kafka.group.id=neo4j-core1
streams.sink.enabled=true
streams.sink.topic.cypher.neorr=WITH event.payload as payload MATCH (p:Person) WHERE ID(p)=payload[0] SET p.pagerank = payload[1]
</code></pre><p>Note that for the second and third core servers we change <code>kafka.group.id</code> to <code>neo4j-core2</code> and <code>neo4j-core3</code> respectively.</p>
<p>For the read replica, we’ll need to add the following to <strong>neo4j.config</strong>:</p>
<pre><code>#************
# Kafka Config - Procedure
#************
kafka.zookeeper.connect=localhost:2181
kafka.bootstrap.servers=localhost:9092
kafka.group.id=neo4j-read1
</code></pre><p>You will need to download and save the neo4j-streams jar into the <strong>plugins</strong> folder. You also need to add the graph algorithms library, via Neo4j Desktop or <a target="_blank" href="https://neo4j.com/docs/graph-algorithms/current/introduction/#_installation">manually</a>.</p>
<p>With these changes saved to the respective config files and the plugins installed, we start everything up in the following order:</p>
<ul>
<li>Apache Zookeeper</li>
<li>Apache Kafka</li>
<li>The three instances for the Neo4j causal cluster</li>
<li>The read replica</li>
</ul>
<p>Once all of the Neo4j instances are up and running and the cluster has discovered all of the members, we can now run the following query on the read replica:</p>
<pre><code>CALL algo.pageRank.stream(
  'MATCH (p:Person) RETURN id(p) AS id',
  'MATCH (p1:Person)--&gt;()&lt;--(p2:Person) RETURN distinct id(p1) AS source, id(p2) AS target',
  {graph:'cypher'}) YIELD nodeId, score
WITH [nodeId, score] AS res
CALL streams.publish('neorr', res)
RETURN COUNT(*)
</code></pre><p>This Cypher query calls the <a target="_blank" href="https://neo4j.com/docs/graph-algorithms/current/algorithms/page-rank/">PageRank</a> algorithm with the specified configuration. Once the algorithm completes, we stream the returned node IDs and PageRank scores to the specified topic.</p>
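<p>For intuition about what the algorithm streams back, here is a toy power-iteration PageRank in plain Python. This is an illustration only, not the Neo4j implementation (which is far more optimised), and the sample graph is invented:</p>

```python
def pagerank(adjacency, damping=0.85, iterations=50):
    """Toy power-iteration PageRank over {node: [outgoing neighbours]}."""
    n = len(adjacency)
    ranks = {node: 1.0 / n for node in adjacency}
    for _ in range(iterations):
        # Every node gets the teleport share, then receives its neighbours' votes.
        new_ranks = {node: (1.0 - damping) / n for node in adjacency}
        for node, neighbours in adjacency.items():
            if not neighbours:
                continue
            share = damping * ranks[node] / len(neighbours)
            for neighbour in neighbours:
                new_ranks[neighbour] += share
        ranks = new_ranks
    return ranks

# Tiny graph: 'c' is pointed at by both 'a' and 'b', so it scores highest.
graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = pagerank(graph)
```

The `(nodeId, score)` pairs streamed by the query above are conceptually the items of this `scores` mapping.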
<p>We can have a look at what the neorr topic looks like by running Step 5 of the <a target="_blank" href="https://kafka.apache.org/quickstart">Apache Kafka quick start guide</a> (replacing <code>test</code> with <code>neorr</code>):</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/hmHR0G3NWw8HQVhnN10JN0XpCSBqWhj6i2Jy" alt="Image" width="600" height="400" loading="lazy"></p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/gfuq2bK5PKK67Whox2xOyllilj5XazInfuU2" alt="Image" width="600" height="400" loading="lazy"></p>
<h3 id="heading-summary">Summary</h3>
<p>In this post we’ve demonstrated:</p>
<ul>
<li>Separating transactional and analytical data concerns</li>
<li>Painlessly flowing analytical results back for real-time consumption</li>
</ul>
<p>Whilst we’ve used a simple example, you can see how complex analytical work can be carried out, supporting an event-driven architecture.</p>
 ]]>
                </content:encoded>
            </item>
        
            <item>
                <title>
                    <![CDATA[ Monitoring the French Presidential Election on Twitter with Python ]]>
                </title>
                <description>
                    <![CDATA[ By Romain Thalineau A while ago I read this nice article from Laurent Luce where he explained how he implemented a system that collected the tweets related to the 2012 French presidential election. The article is very well written, and I highly recom... ]]>
                </description>
                <link>https://www.freecodecamp.org/news/monitoring-the-french-presidential-election-on-twitter-with-python-6a2a9310e6f4/</link>
                <guid isPermaLink="false">66c35b7d9de50ee9ca7fa70d</guid>
                
                    <category>
                        <![CDATA[ Neo4j ]]>
                    </category>
                
                    <category>
                        <![CDATA[ politics ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Python ]]>
                    </category>
                
                    <category>
                        <![CDATA[ tech  ]]>
                    </category>
                
                    <category>
                        <![CDATA[ Web Development ]]>
                    </category>
                
                <dc:creator>
                    <![CDATA[ freeCodeCamp ]]>
                </dc:creator>
                <pubDate>Sun, 12 Feb 2017 09:19:26 +0000</pubDate>
                <media:content url="https://cdn-media-1.freecodecamp.org/images/1*Gm6Q_bRGS6yJWRuESpPx5w.png" medium="image" />
                <content:encoded>
                    <![CDATA[ <p>By Romain Thalineau</p>
<p>A while ago I read <a target="_blank" href="http://www.laurentluce.com/posts/python-twitter-statistics-and-the-2012-french-presidential-election/">this nice article</a> from Laurent Luce where he explained how he implemented a system that collected the tweets related to the 2012 French presidential election. The article is very well written, and I highly recommend reading it.</p>
<p>This gave me the idea to implement something similar for the 2017 election. But I wanted to add some features:</p>
<ul>
<li>Instead of using a SQL database for storing the data, I wanted to use a Graph database. The main reason was to experiment with such a system, but it’s fairly easy to see how this is a good fit for social media data.</li>
<li>I wanted to be able to monitor the data in real time. Practically speaking, this means that the data need to be processed as they arrive. This would also involve serving the analyzed data to a web site with data visualizations.</li>
<li>Ideally I wanted to run a sentiment analysis on the tweets. I would train a learning algorithm and implement it along the data pipeline to serve its results in real time.</li>
</ul>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*y9G8AIt2rJnWwhjdv_Zn0w.png" alt="Image" width="800" height="552" loading="lazy">
<em><a target="_blank" href="https://www.auguratech.com/#/twitter/time_series">Time Series Analysis</a></em></p>
<p>Well, I managed to build all of this. You can see what it looks like on <a target="_blank" href="https://www.auguratech.com/#/twitter">my personal website</a>. So far, there are two simple analyses:</p>
<ul>
<li><a target="_blank" href="https://www.auguratech.com/#/twitter/time_series">The first one</a> is a time series analysis, which shows the number of tweets per candidate as a function of the date. Besides being able to select the starting/ending date and the period, you can also display just the candidates you would like to see by clicking on their names in the visualization.</li>
<li><a target="_blank" href="https://www.auguratech.com/#/twitter/geospatial">The second analysis</a> displays the geolocation of the tweets. The options are relatively similar to the first analysis.</li>
</ul>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*G8iD7P81--DVJf1NTDTmbA.png" alt="Image" width="800" height="624" loading="lazy">
<em><a target="_blank" href="https://www.auguratech.com/#/twitter/geospatial">Tweet geolocation analysis</a></em></p>
<p>For collecting the data from Twitter, I used an approach similar to Laurent Luce’s. Instead of focusing on the similarities, I’ll show you the approaches I took that were different.</p>
<h4 id="heading-storing-the-tweets-in-a-graph-database">Storing the tweets in a graph database</h4>
<p>As I said, I wanted to store the data in a graph database. I chose to use <a target="_blank" href="https://neo4j.com/">Neo4J</a>. In a graph database, data are modeled using a combination of nodes, edges, and properties structures.</p>
<p><img src="https://cdn-media-1.freecodecamp.org/images/1*XlHtECBpilVo7Jk7ujcCbA.png" alt="Image" width="330" height="271" loading="lazy">
<em><a target="_blank" rel="noopener" href="http://network.graphdemos.com/">Image credit</a></em></p>
<p>In our case, nodes can represent a tweet, a user or even a hashtag. They can be distinguished by using a label. The relationship between nodes is handled by connecting them through edges. For example, a user node can be connected to a tweet node via a POSTS relationship.</p>
<p>The relationships are directional. A tweet can’t POST a user, but it can MENTION a user.</p>
<p>Finally both nodes and edges (relationships) can hold properties. For example, a user has a name and a tweet has text.</p>
<p>When interacting with a graph database, an Object Graph Mapper (OGM) is particularly useful. In this project, I’ve been using <a target="_blank" href="https://github.com/robinedwards/neomodel">Neomodel</a>. It exposes an API relatively similar to the Django models API: each node type is a Python class, with its properties and relationships declared as class attributes.</p>
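<p>A minimal sketch of what such a Neomodel definition might look like (the class and property names here are illustrative, not the repo’s exact definitions):</p>

```python
# Hedged sketch of a Neomodel data model. Names are illustrative; the
# real definitions live in the repo's models.py. Requires the neomodel
# package, and a running Neo4J instance to actually save nodes.
from neomodel import (StructuredNode, StringProperty, DateTimeProperty,
                      RelationshipTo)

class Tweet(StructuredNode):
    id_str = StringProperty(unique_index=True)  # Twitter's tweet id
    text = StringProperty()                     # tweet body
    created_at = DateTimeProperty()

class User(StructuredNode):
    id_str = StringProperty(unique_index=True)
    name = StringProperty()
    # Directed relationship: (User)-[:POSTS]->(Tweet)
    posts = RelationshipTo(Tweet, 'POSTS')
```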
<p>Both the properties and the relationships are defined on the model classes. I invite you to check the models file in <a target="_blank" href="https://github.com/romaintha/twitter/blob/master/twitter/models.py">my github repo</a> to see the full data model definition.</p>
<p>Neo4J being a NoSQL database, it uses its own query language, called Cypher. It’s a pretty straightforward language. For instance, the following query will return all the tweets posted by a user that contain the word “fillon” (one of the candidates):</p>
<pre><code>MATCH (u:User)-[:POSTS]-&gt;(t:Tweet) WHERE t.text CONTAINS "fillon" RETURN t
</code></pre><p>Neomodel being an OGM, it provides an API so you don’t have to write many queries manually. You can obtain the same result as above by running:</p>
<pre><code>Tweet.nodes.filter(text__contains="fillon")
</code></pre><h4 id="heading-streaming-from-twitter">Streaming from Twitter</h4>
<p>Twitter provides two ways to get their data. The first one is through a standard REST API. Each endpoint is rate limited, so it isn’t the preferred solution in our case.</p>
<p>Luckily, Twitter also provides a streaming API. By setting a filter, we can receive all the tweets that pass this filter (with a limit of 1% of the global amount of tweets published at instant t). The library <a target="_blank" href="https://github.com/tweepy/tweepy">Tweepy</a> facilitates this process.</p>
<p>As you can see in <a target="_blank" href="https://github.com/romaintha/twitter/blob/master/twitter/streaming_api.py">my repo</a>, you need to define a Listener class, which will trigger some actions while streaming. For instance, the method “on_status” is called any time a tweet is streamed.</p>
<p>In addition, I defined a Streaming class whose responsibilities are to authenticate to Twitter, to instantiate a Tweepy stream with the above Listener, and to expose a method to start streaming. The “start_streaming” method accepts a “to_track” argument, which is a list of words on which you want to filter.</p>
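<p>Put together, the Listener and Streaming classes might look roughly like this (a simplified sketch assuming tweepy 3.x, where <code>StreamListener</code> still exists; the real code is in the repo):</p>

```python
# Simplified sketch of the streaming setup, assuming tweepy 3.x
# (StreamListener was removed in tweepy 4). Class and argument names
# mirror the article but are illustrative, not the repo's exact code.
import tweepy

class BatchListener(tweepy.StreamListener):
    def __init__(self, pipeline, batch_size=100):
        super().__init__()
        self.pipeline = pipeline      # function that receives a batch of tweets
        self.batch_size = batch_size
        self.batch = []

    def on_status(self, status):      # called once per streamed tweet
        self.batch.append(status)
        if len(self.batch) >= self.batch_size:
            self.pipeline(self.batch) # hand the whole batch off at once
            self.batch = []

class Streaming:
    def __init__(self, consumer_key, consumer_secret,
                 access_token, access_token_secret,
                 pipeline, batch_size=100):
        auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
        auth.set_access_token(access_token, access_token_secret)
        self.stream = tweepy.Stream(auth, BatchListener(pipeline, batch_size))

    def start_streaming(self, to_track):
        # to_track is the list of words to filter the stream on
        self.stream.filter(track=to_track)
```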
<p>You have to instantiate the Streaming class with a bunch of arguments. Aside from the Twitter API credentials, you need “pipeline” and “batch_size” arguments. The latter is a number specifying how many tweets are processed at once.</p>
<p>Since processing a tweet involves saving it to Neo4J, doing it one by one is a very costly operation. Saving them by batches of 100 (or even more in some cases) improves performance considerably.</p>
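<p>The batching pattern itself is plain Python. A minimal illustrative helper (not the repo’s actual code) looks like this:</p>

```python
# Illustrative helper showing the batching pattern: group incoming items
# into fixed-size batches so each database round trip saves many tweets
# instead of one.
def batched(items, batch_size=100):
    """Yield successive batches of at most batch_size items."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:            # flush the trailing, possibly partial, batch
        yield batch
```

<p>For example, <code>list(batched(range(5), 2))</code> yields <code>[[0, 1], [2, 3], [4]]</code>.</p>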
<p>The “pipeline” argument must be a reference to a function, which will receive the batch of tweets. Inside of this, you are free to do whatever you want. I provided an example of it in the <a target="_blank" href="https://github.com/romaintha/twitter/blob/master/twitter/utils.py">utils.py</a> module.</p>
<p>As you can see, this function makes a call to an asynchronous Celery task defined in the <a target="_blank" href="https://github.com/romaintha/twitter/blob/master/twitter/tasks.py">tasks.py</a> module. <a target="_blank" href="http://www.celeryproject.org/">Celery</a> is a Python distributed task queue library. I used it with <a target="_blank" href="https://www.rabbitmq.com/">RabbitMQ</a> as a message broker. So how does it work? Let us get back to the “streaming_pipeline” function in the <a target="_blank" href="https://github.com/romaintha/twitter/blob/master/twitter/utils.py">utils.py</a> module, and focus on this line:</p>
<pre><code>bulk_parsing.delay(users_attributes, tweets_attributes)
</code></pre><p>When this line is processed, instead of running the “bulk_parsing” function synchronously, a message is published to a broker (here RabbitMQ). This allows consumers (workers) to retrieve these messages and process the “bulk_parsing” task asynchronously and in parallel. Why does this matter? Because it enables horizontal scaling of tweet processing. If the messages accumulate faster than you can process them, you can add more workers to help consume them.</p>
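<p>The task side is a standard Celery setup. A hedged sketch (the broker URL and task body are illustrative placeholders, not the repo’s code):</p>

```python
# Sketch of the asynchronous step, assuming a RabbitMQ broker running
# locally. The task body is a placeholder; the real task parses the
# batch and saves the users and tweets to Neo4J.
from celery import Celery

app = Celery('twitter', broker='amqp://guest@localhost//')

@app.task
def bulk_parsing(users_attributes, tweets_attributes):
    # Executed on a worker process, possibly on another machine:
    # persist the batch to the graph database here.
    pass

# Calling .delay() only publishes a message to the broker and returns
# immediately, so the streaming process never blocks on the write:
#   bulk_parsing.delay(users_attributes, tweets_attributes)
```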
<p>One final remark. I wanted the process to be as versatile as possible, in the sense that if the processing needed to be changed, or if something needed to be added, it should be easy to do so. In this case, I can just change the “streaming_pipeline” function and add some asynchronous tasks. It’s quick and easy to modify.</p>
<p>Thanks for reading!</p>
<ul>
<li>Be sure to check out the code <a target="_blank" href="https://github.com/romaintha/twitter">in my GitHub repo</a>.</li>
<li>You can see all this in action <a target="_blank" href="https://www.auguratech.com/#/twitter">on my site</a>, where I used this to feed some analysis.</li>
</ul>
 ]]>
                </content:encoded>
            </item>
        
    </channel>
</rss>
