Question about Wikipedia API involving search results

glennqhurd · December 3, 2017, 4:37am

I just got started on the Wikipedia API and looked through some of the posts about it on these forums to find out how to do a getJSON query using jQuery. Here’s the code I’m using; I got the framework from a different post:

$.getJSON("https://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=" + document.getElementById("search_input").value + "&rvprop=content&format=json&rvsection=2&rvparse=5&callback=?", function(result) {
    console.log(result);
  });

So my question is: how do I find multiple pages worth of JSON using this code, and if I have to change it, what should I change? I understand most of the parameters from reading the Wikipedia API manual, but I don’t know how rvsection and rvparse work, and they look like the modifiers that determine the results. Originally rvsection was 0 and rvparse was 1; after I changed them, the JSON results were the same. Search words I used include Brain, Flowers, and Pokemon (I was just picking words off the top of my head that would likely have multiple matches). Any help is appreciated. I really want to understand how this works, and the API FAQ was kind of obscure for me beyond the very basics.

PortableStick · December 3, 2017, 2:42pm

I agree. The MediaWiki API is thoroughly confusing and the documentation doesn’t help one bit. I have no idea what these rv* options do, and I don’t know what you mean by “find multiple pages worth of JSON”, but I can link to the settings I use and know work: link.

glennqhurd · December 4, 2017, 12:36am

OK, what I meant by “find multiple pages worth of JSON” is that part of the returned JSON object is a collection labeled pages, and each entry in it has properties like pageid. I want to be able to get a certain number of pages at a time for my website and then show the user buttons/links to those pages. What I’m getting is one page at a time, even for search queries that I thought would have many different pages to choose from.

glennqhurd · December 4, 2017, 4:57am

Update: The GET request code I was using only returns the exact page whose title matches the keyword. This wasn’t what I wanted, so I did some digging and found an API URL that runs a search instead:

"https://en.wikipedia.org/w/api.php?action=query&format=json&list=search&continue=&srsearch=" + document.getElementById("search_input").value + "&srwhat=text&srprop=timestamp"

However, when I use this code (a slightly changed version of code from the MediaWiki API help site) I get this error:
No ‘Access-Control-Allow-Origin’ header is present on the requested resource.

I did some more digging and, via Stack Overflow, found this tutorial on CORS: https://www.html5rocks.com/en/tutorials/cors/

My question this time is: am I on the right path, or do you guys know of a simpler method of doing this without using something complicated like CORS?

Thanks for the help so far.

PortableStick · December 4, 2017, 7:15pm

Can you supply a link to your code?

glennqhurd · December 5, 2017, 1:53am

Sure, the CodePen where I’m working on this is at https://codepen.io/glennqhurd/pen/oomXmq.

PortableStick · December 5, 2017, 2:57am

XHR is complicated and error prone. Your problem will probably be fixed by using the fetch API. If you haven’t used it yet, it’s quite simple, though if you’re unfamiliar with promises it will take a bit of getting used to. Here’s the basic form that should work for you.

fetch("https://some.api")
    .then(function(response) {
        // response.ok is true for HTTP status codes 200-299
        if (response.ok) return response.json();
        throw new Error(response.statusText);
    })
    .then(function(data) {
        // do something with the parsed JSON data
    })
    .catch(function(error) {
        // do something with the error (error.message holds the text)
    });
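
Applied to the search URL from earlier in the thread, that pattern might look something like the sketch below. The helper names buildSearchUrl and searchWikipedia are made up for illustration, and the empty continue= parameter is left out. The origin=* parameter is also an assumption that is not in the URL above: it is what the MediaWiki API documents for anonymous cross-origin requests, and the Access-Control-Allow-Origin error is a browser CORS check that applies to fetch as well as XHR.

```javascript
// Endpoint taken from the thread.
const API_ENDPOINT = "https://en.wikipedia.org/w/api.php";

// Hypothetical helper: builds the search URL from the parameters used above.
function buildSearchUrl(term) {
  const params = new URLSearchParams({
    action: "query",
    format: "json",
    list: "search",
    srsearch: term,      // URLSearchParams encodes spaces, "&", etc. in the term
    srwhat: "text",
    srprop: "timestamp",
    origin: "*"          // assumption: requests CORS headers for anonymous access
  });
  return API_ENDPOINT + "?" + params.toString();
}

// Hypothetical helper: resolves to the array of search hits.
function searchWikipedia(term) {
  return fetch(buildSearchUrl(term))
    .then(function (response) {
      if (response.ok) return response.json();
      throw new Error(response.statusText);
    })
    .then(function (data) {
      // For list=search, the hits are under query.search.
      return data.query.search;
    });
}
```

In the page itself this could be called as searchWikipedia(document.getElementById("search_input").value).then(console.log), with a .catch for error handling.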

glennqhurd · December 6, 2017, 9:20pm

OK here’s the API call that I got to work:

$.getJSON("https://en.wikipedia.org/w/api.php?action=query&format=json&list=search&continue=&srsearch=" + document.getElementById("search_input").value + "&srwhat=text&srprop=timestamp&origin=*", function(result) {
    console.log(result);
  });

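It’s different from my previous attempt because it adds the origin=* parameter (and drops the stray double slash in the path), which gets rid of the Access-Control-Allow-Origin error. It returns a JSON object with 10 pages. Thanks for the suggestions though.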
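
For getting a certain number of pages at a time, here is a minimal sketch, assuming the srlimit and sroffset parameters of the search module (10 is the default batch size when srlimit is not given). The helper name buildSearchPageUrl and the zero-based pageIndex are made up for illustration.

```javascript
// Hypothetical helper: builds the URL for one batch of search results.
// pageIndex is zero-based; pageSize maps to srlimit, and the starting
// position (pageIndex * pageSize) maps to sroffset.
function buildSearchPageUrl(term, pageIndex, pageSize) {
  const params = new URLSearchParams({
    action: "query",
    format: "json",
    list: "search",
    srsearch: term,
    srwhat: "text",
    srprop: "timestamp",
    srlimit: String(pageSize),
    sroffset: String(pageIndex * pageSize),
    origin: "*"
  });
  return "https://en.wikipedia.org/w/api.php?" + params.toString();
}

// Example: the third batch of 10 results for "dog".
const thirdPageUrl = buildSearchPageUrl("dog", 2, 10);
console.log(thirdPageUrl);
```

The resulting URL can be passed to $.getJSON exactly like the one above. When more results are available, the response should also carry a continue object whose sroffset value can be used for the next request instead of computing the offset by hand.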

PortableStick · December 6, 2017, 10:05pm

Here’s the response I see when I search for “dog”

Were there any errors in your browser console when you tried?

glennqhurd · December 7, 2017, 11:24am

Nope, problem solved with that so far. I also found the URL for translating a pageid into the actual page: https://en.wikipedia.org/?curid=426957 will take you to the “Dog” page, since the number matches the page id.

PortableStick · December 7, 2017, 4:59pm

You can also just append the title to the URL https://en.wikipedia.org/wiki/, like this: https://en.wikipedia.org/wiki/Dog. This way it doesn’t need to redirect.

glennqhurd · December 7, 2017, 5:02pm

Gotcha, my version just does it based on the unique page id that I get back from the search.
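
The two link styles discussed above could be sketched as small helpers like the ones below. The function names are hypothetical, the sample data is made up to resemble entries of the search results, and the title handling (spaces to underscores, then encodeURIComponent) is an assumption that should be fine for ordinary titles but has not been checked against unusual ones.

```javascript
const WIKI_BASE = "https://en.wikipedia.org";

// Link by page id, as in the ?curid= URL from the thread.
function pageUrlFromId(pageid) {
  return WIKI_BASE + "/?curid=" + encodeURIComponent(pageid);
}

// Link by title: replace spaces with underscores, then percent-encode.
function pageUrlFromTitle(title) {
  return WIKI_BASE + "/wiki/" + encodeURIComponent(title.replace(/ /g, "_"));
}

// Turn search hits into { title, url } pairs ready to render as links.
function toLinks(results) {
  return results.map(function (item) {
    return { title: item.title, url: pageUrlFromId(item.pageid) };
  });
}

// Made-up sample containing only the fields used here.
const sampleResults = [{ title: "Dog", pageid: 426957 }];
console.log(toLinks(sampleResults));
```

Either helper works for building the buttons/links; the id-based one avoids any title encoding questions, since the id comes straight from the search response.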