Support existing Wikibase REST API routes for Lexemes
Open, Medium, Public, Feature

Description

Feature summary (what you would like to be able to do and where):
The existing Wikibase REST API routes work for Items. They should also work for Lexemes.

Use case(s) (list the steps that you performed to discover that problem, and describe the actual underlying problem which you want to solve. Do not describe only a solution):
Accessing data and editing data on Lexemes is needed to make all of Wikidata's data accessible for developers in the same API.

Benefits (why should this be implemented?):
The Wikibase REST API is more comprehensive and useful for application building.

Event Timeline

Lydia_Pintscher moved this task from Backlog to Later on the Wikibase REST API (archieved) board.

Sure, I'm attempting to work on this feature. Created a WblRestApi.php so far on my machine. My questions are as follows:

  • Are classes in src/DataAccess/Store/*Lookup.php for GET methods, and src/DataAccess/ChangeOp/*ChangeOp*.php for POST / PUT / DELETE etc. methods?
  • Is "serialisation" required for lexeme objects? If so, where are objects in charge located?
  • Is calling Middleware in Wikibase package an appropriate action, or should I create separate ones?
  • Should I create things in the structure of UseCases as in Wikibase?

1125623: Created WblRestApi class, stubs for RouteHandlers and (some) route handler classes | https://gerrit.wikimedia.org/r/c/mediawiki/extensions/WikibaseLexeme/+/1125623
Created this commit just now.

+2 by jenkins-bot on the patch above.

Change #1125623 had a related patch set uploaded (by Unite together; author: Unite together):

[mediawiki/extensions/WikibaseLexeme@master] [Part of T329096] Created WblRestApi class, stubs for RouteHandlers and registered route handler classes in extension.json

https://gerrit.wikimedia.org/r/1125623

Hi @Unite_together! Thank you for your interest in expanding the Wikibase REST API to include WikibaseLexeme. As a general note, the Wikibase REST API follows an architectural pattern called “Hexagonal Architecture”. You can read more about it and how we follow it in the following links:

However, it is important to note that we do not yet follow this pattern in WikibaseLexeme itself, so there are some pivotal differences between the codebases that will have to be reconciled if you decide to follow that pattern. With regard to your specific questions, I will attempt to answer them inline.

Are classes in src/DataAccess/Store/*Lookup.php for GET methods, and src/DataAccess/ChangeOp/*ChangeOp*.php for POST / PUT / DELETE etc. methods?

These classes can serve those purposes, but you are not required to use them that way. In the REST API, for example, we get an Item by using the base EntityRevisionLookup to retrieve Item revisions from the database. As the chart in our docs shows, for Wikibase we treat the Wikibase Data Model namespace as the core of the architecture and as our main data model. Lexemes, however, can have quite different use cases and many variations of Lexeme, Sense and Form lookups, so the rules on when to use which class may be less clear-cut.

Wikibase also has ChangeOp classes, in repo/includes/ChangeOp in the Wikibase.git repository. We chose not to use them, because they conflate several responsibilities (the change itself, validation, edit summaries, etc.), and we wanted to adhere to the Single Responsibility Principle (the S in SOLID) so that we implement only the features that REST API use cases strictly require. The best guide is therefore your own use cases, once they are clearly defined. For an example of how we did it in the REST API, see the following ticket, its sub-tasks, and the patches submitted for it: T337720: 🏠️ Provide an initial way of reading property data using Wikibase REST API. In particular, T337937: 🏠️ Implement GetPropertyData happy path shows an initial implementation of the GET /v0/entities/properties/{property_id} endpoint.
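To make the single-responsibility idea concrete, here is a minimal, language-neutral sketch (written in Python for brevity; the real code would be PHP). It is not the REST API's actual code: every name in it (GetLexemeLemmas, LexemeLemmasRetriever, InMemoryLexemeLemmasRetriever, LexemeNotFound) is hypothetical. The use case depends only on a small retriever interface, and a real adapter behind that interface could wrap whichever lookup class fits.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class LexemeLemmas:
    """Read model returned by the use case (hypothetical shape)."""
    lexeme_id: str
    lemmas: dict  # language code -> lemma text


class LexemeLemmasRetriever(Protocol):
    """Port: the only thing the use case knows about data access."""
    def get_lemmas(self, lexeme_id: str) -> Optional[LexemeLemmas]: ...


class InMemoryLexemeLemmasRetriever:
    """Adapter stand-in; a real one might wrap an entity revision lookup."""
    def __init__(self, store: dict):
        self._store = store

    def get_lemmas(self, lexeme_id: str) -> Optional[LexemeLemmas]:
        lemmas = self._store.get(lexeme_id)
        return None if lemmas is None else LexemeLemmas(lexeme_id, dict(lemmas))


class LexemeNotFound(Exception):
    """Raised when the requested Lexeme does not exist."""


class GetLexemeLemmas:
    """Use case: reads lemmas only; no validation or edit-summary logic."""
    def __init__(self, retriever: LexemeLemmasRetriever):
        self._retriever = retriever

    def execute(self, lexeme_id: str) -> LexemeLemmas:
        result = self._retriever.get_lemmas(lexeme_id)
        if result is None:
            raise LexemeNotFound(lexeme_id)
        return result


use_case = GetLexemeLemmas(InMemoryLexemeLemmasRetriever({"L1": {"en": "example"}}))
print(use_case.execute("L1").lemmas)
```

Because the use case never names a concrete lookup class, swapping the in-memory stand-in for a database-backed adapter should not require touching the use case itself.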

Is "serialisation" required for lexeme objects? If so, where are objects in charge located?

Whenever we read from or write to the database, or handle requests, we must deserialize strings into objects and serialize objects back into strings. This follows from how Lexemes are stored and from the fact that JSON is the main exchange format between client and server in REST APIs. At the moment, there are serializers for Lexemes, Forms and Senses in the src/Serialization directory in the WikibaseLexeme.git repository. However, as mentioned, a lot of this depends on your use case. Going back to our property example, you may consult T337839: 🏠️ Create property data serializer. In this subtask, we create a REST-specific data serializer which combines already existing serializers in order to return a JSON response that matches the OpenAPI definition created in T337837: 🏠️ Add `GET /entities/properties/{property_id}` to the openapi definition.
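The idea of a REST-specific serializer that combines existing ones can be sketched roughly as follows. This is an illustration in Python rather than PHP, with plain dictionaries standing in for Lexeme objects; the function names and the response shape are assumptions, not the serializers in src/Serialization and not an agreed API format.

```python
import json


def serialize_term_list(terms: dict) -> dict:
    """Stand-in for an existing serializer: language code -> term object."""
    return {lang: {"language": lang, "value": text} for lang, text in terms.items()}


def serialize_form(form: dict) -> dict:
    """Stand-in for an existing Form serializer."""
    return {
        "id": form["id"],
        "representations": serialize_term_list(form["representations"]),
    }


def serialize_lexeme_for_rest(lexeme: dict) -> dict:
    """REST-specific serializer: reuses the pieces above, owns the response shape."""
    return {
        "id": lexeme["id"],
        "lemmas": serialize_term_list(lexeme["lemmas"]),
        "forms": [serialize_form(form) for form in lexeme["forms"]],
    }


# Hypothetical input data, for illustration only.
lexeme = {
    "id": "L1",
    "lemmas": {"en": "example"},
    "forms": [{"id": "L1-F1", "representations": {"en": "examples"}}],
}
body = json.dumps(serialize_lexeme_for_rest(lexeme))  # object -> JSON string
print(body)
```

The composing function is the only place that decides which fields the response contains, so it is the piece that would be checked against the OpenAPI definition.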

Is calling Middleware in Wikibase package an appropriate action, or should I create separate ones?

While the Wikibase middleware can be reused, there’s no guarantee that it does exactly what the REST API for Lexemes requires. You may well find some of the existing middleware useful, in which case we can try to make it more reusable once we know which pieces are needed and whether we have the capacity to do so. However, retrieving and editing Lexemes through the REST API might also require middlewares of its own, which you would then need to create. A good rule of thumb is to create middlewares for operations that have to happen across multiple endpoint requests and responses. In T337852: 🏠️ Apply middlewares (authentication, user agent, unexpected error handling) you can see an example of how we apply our existing middlewares to retrieving a single property. As a first step, though, you can simply prototype the “happy path” of a request-response cycle and assume that no authentication or validation is required. This can get you to results quite quickly, and the cases the happy path does not cover will then show you which middleware you need.
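As a rough sketch of that rule of thumb, the snippet below wraps a happy-path handler in two cross-cutting middlewares. It is written in Python for brevity and does not reproduce the Wikibase middleware classes; the function names, the request and response dictionaries, and the error codes are all hypothetical.

```python
from typing import Callable

Request = dict
Response = dict
Handler = Callable[[Request], Response]


def user_agent_check(next_handler: Handler) -> Handler:
    """Reject requests without a User-Agent header (hypothetical error code)."""
    def handle(request: Request) -> Response:
        if not request.get("headers", {}).get("User-Agent"):
            return {"status": 400, "body": {"code": "missing-user-agent"}}
        return next_handler(request)
    return handle


def unexpected_error_handler(next_handler: Handler) -> Handler:
    """Turn any uncaught exception into a generic 500 response."""
    def handle(request: Request) -> Response:
        try:
            return next_handler(request)
        except Exception:
            return {"status": 500, "body": {"code": "unexpected-error"}}
    return handle


def get_lexeme_happy_path(request: Request) -> Response:
    """Prototype handler: assumes a valid request, no authentication or validation."""
    return {"status": 200, "body": {"id": request["path_params"]["lexeme_id"]}}


def apply_middlewares(handler: Handler, middlewares: list) -> Handler:
    """Wrap the handler so that the first middleware in the list runs outermost."""
    for middleware in reversed(middlewares):
        handler = middleware(handler)
    return handler


route = apply_middlewares(get_lexeme_happy_path, [unexpected_error_handler, user_agent_check])
print(route({"headers": {"User-Agent": "demo"}, "path_params": {"lexeme_id": "L1"}}))
```

The handler itself stays a pure happy path; anything that must hold for every endpoint lives in the wrappers and can be added later without changing it.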

Should I create things in the structure of UseCases as in Wikibase?

This is a relatively tough question to answer. The advantage we found in creating UseCase classes is that they make it simpler to apply and understand code changes in terms of the expectations of our product manager, which in turn helps us maintain the software we created. To answer that question, you would therefore have to think about what the product requirements for Lexeme REST API endpoints are, and who will maintain these endpoints in the future. We found that a good place to start the discussion about requirements is to create an OpenAPI specification, which can lead to a fruitful exchange with our PMs and will also help you structure your code into use cases when the time comes. To read more about OpenAPI specifications, and to start sketching some ideas, see the following links:

Once you have the API specifications, you can start considering a singular use case as a prototype, which will help further the discussion.
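To illustrate how a specification can lead to use cases, here is a small sketch that holds a draft OpenAPI fragment as a Python dictionary and lists its operations. The path /entities/lexemes/{lexeme_id}, the operationId and the ID pattern are assumptions made for this example (loosely mirroring the property endpoint mentioned above), not an agreed design.

```python
import json

# Hypothetical first draft; the path, operationId and schema are illustrative.
spec = {
    "openapi": "3.1.0",
    "info": {"title": "Lexeme REST API draft (hypothetical)", "version": "0.1"},
    "paths": {
        "/entities/lexemes/{lexeme_id}": {
            "get": {
                "operationId": "getLexeme",
                "parameters": [
                    {
                        "name": "lexeme_id",
                        "in": "path",
                        "required": True,
                        "schema": {"type": "string", "pattern": "^L[1-9][0-9]*$"},
                    }
                ],
                "responses": {
                    "200": {"description": "The requested Lexeme"},
                    "404": {"description": "No Lexeme with that ID"},
                },
            }
        }
    },
}


def list_use_cases(spec: dict) -> list:
    """Each operationId is a candidate use case to prototype."""
    return sorted(
        operation["operationId"]
        for path_item in spec["paths"].values()
        for operation in path_item.values()
    )


print(json.dumps(spec, indent=2))
print(list_use_cases(spec))
```

Starting from a single operation like this keeps the first prototype, and the first patch, small.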

However you decide to proceed, please feel free to contact us again and one of our teams or PMs will be happy to respond. Thanks again for wanting to put some work into this topic. Looking forward to seeing what you come up with.

Hi @Unite_together,

In addition to the input by Itamar, I’d like to just emphasize that our team is currently focusing on other high priority projects (such as building out search in the REST API) and will continue to do so for this year.

We’ll do our best to review your patches and respond to your questions, but it will not always be feasible on our side to do so and response time might be slow.

In case you have urgent questions, feel free to tag me and we’ll find a way to get you some help.

@ItamarWMDE Should I change the structure of WikibaseLexeme extension to follow that of Wikibase package?

Change #1131000 had a related patch set uploaded (by Unite together; author: Unite together):

[mediawiki/extensions/WikibaseLexeme@master] [Part of T329096] Created doc (openapi.json) for future REST API

https://gerrit.wikimedia.org/r/1131000

Change #1134118 had a related patch set uploaded (by Unite together; author: Unite together):

[mediawiki/extensions/WikibaseLexeme@master] [Part of T329096] Reorganize objects in src/ to various locations to implement "hexagonal architecture" of OpenAPI.

https://gerrit.wikimedia.org/r/1134118

The patch above is an attempt to implement the "hexagonal architecture" mentioned further above by @ItamarWMDE. It still has many issues and is very likely to fail automatic tests, mainly because I'm unsure whether some of the objects located in src/ have been moved to the correct locations. I'm uploading it for now anyway, as I really need some guidance on solving the outstanding issues caused by this major adjustment.

Hi @Unite_together, the team is currently very low on resources and has some high priority commitments that need to be finished. I'd suggest you start with just one use case so the team can also look over a patch quicker versus a very large one where it might be hard to provide support.

Change #1134118 had a related patch set uploaded (by Unite together; author: Unite together):

[mediawiki/extensions/WikibaseLexeme@master] [Part of T329096] Reorganize objects in src/ to various locations to implement "hexagonal architecture" of OpenAPI.

https://gerrit.wikimedia.org/r/1134118

https://integration.wikimedia.org/ci/job/mwext-php74-phan/88716/console

In this patch, I've been getting loads of PhanUndeclaredClass* errors reported by CI checks. The PhpStorm IDE on my machine is also reporting numerous syntax errors in the unit tests of the WikibaseLexeme extension. I really don't know whether these are related, or how to solve them.

PHPUnit Prepare Parallel Run (Composer)
00:02:30.075 INFO:quibble.commands:>>> Start: PHPUnit Prepare Parallel Run (Composer)
00:02:30.553 > MediaWiki\Composer\PhpUnitSplitter\PhpUnitXmlManager::fetchResultsCache
00:02:30.553 
00:02:30.553 Unable to generate results cache URL - is LOG_PATH set?
00:02:30.553 
00:02:30.553 > MediaWiki\Composer\PhpUnitSplitter\PhpUnitXmlManager::listTestsNotice
00:02:30.553 
00:02:30.553 Running `phpunit --list-tests-xml` to get a list of expected tests ... 
00:02:30.553 
00:02:30.976 > phpunit '--list-tests-xml=tests-list-extensions.xml' '--testsuite=extensions'
00:02:31.080 Using PHP 7.4.33
00:02:31.081 Running with MediaWiki settings because there might be integration tests
00:02:43.430 PHPUnit 9.6.21 by Sebastian Bergmann and contributors.
00:02:43.430 
00:02:43.430 Wrote list of tests that would have been run to tests-list-extensions.xml
00:02:43.463 > MediaWiki\Composer\PhpUnitSplitter\PhpUnitXmlManager::splitTestsListExtensions
00:02:44.688 Encountered PHPUnit ErrorTestCase - check for a syntax error in the test suite or an error in a dataProvider!
00:02:44.689 Test suite splitting failed - falling back to linear run
00:02:44.689 Running extensions phpunit suite databaseless tests...
00:02:44.692 Running command ''composer' 'run' '--timeout=0' 'phpunit:entrypoint' '--' '--configuration' 'phpunit-database.xml' '--testsuite' 'extensions' '--exclude-group' 'Broken,ParserFuzz,Stub,Standalone,Database'' ...
00:02:45.865 > phpunit '--configuration' 'phpunit-database.xml' '--testsuite' 'extensions' '--exclude-group' 'Broken,ParserFuzz,Stub,Standalone,Database'
00:02:45.865 Could not read "phpunit-database.xml".
00:02:45.865 Script phpunit handling the phpunit event returned with error code 1
00:02:45.865 Script @phpunit was called via phpunit:entrypoint
00:02:45.865 Test suite splitting failed
00:02:45.894 INFO:quibble.commands:<<< Finish: PHPUnit Prepare Parallel Run (Composer), in 15.817 s

My latest patch is blocked in the pipeline as shown above. What should I do to address it?

Hi @Unite_together,

After reviewing the work you’ve started, it’s clear that building out the REST API for lexemes requires a level of expertise and experience that our current core team needs to handle internally when the time comes. At the moment, our engineers are completely committed to priority work, and unfortunately, they don’t have the bandwidth to review or support this project as it stands.

To keep things productive for everyone, I’d encourage you to explore some of the other open issues or smaller, well-scoped projects to help you build up valuable experience and have a positive impact.

When and if we decide to formally start work on the REST API for lexemes, we’ll make an announcement and share clear guidelines and opportunities for community involvement.

Thanks again for being so enthusiastic about working on the topic!

Change #1147706 had a related patch set uploaded (by Unite together; author: Unite together):

[mediawiki/extensions/WikibaseLexeme@master] [Part of T329096] Implemented GetLexemeLemma API

https://gerrit.wikimedia.org/r/1147706

Change #1148302 had a related patch set uploaded (by Unite together; author: Unite together):

[mediawiki/extensions/WikibaseLexeme@master] [Part of T329096] Implement GetLexemeStatement API

https://gerrit.wikimedia.org/r/1148302

Change #1148319 had a related patch set uploaded (by Unite together; author: Unite together):

[mediawiki/extensions/WikibaseLexeme@master] [Part of T329096] Implemented AddLexemeStatement API

https://gerrit.wikimedia.org/r/1148319

Change #1148497 had a related patch set uploaded (by Unite together; author: Unite together):

[mediawiki/extensions/WikibaseLexeme@master] [Part of T329096] Implemented PatchLexemeStatement API

https://gerrit.wikimedia.org/r/1148497

Change #1148500 had a related patch set uploaded (by Unite together; author: Unite together):

[mediawiki/extensions/WikibaseLexeme@master] [Part of T329096] Implemented RemoveLexemeStatement API

https://gerrit.wikimedia.org/r/1148500

Change #1148797 had a related patch set uploaded (by Unite together; author: Unite together):

[mediawiki/extensions/WikibaseLexeme@master] [Part of T329096] Implemented SetLexemeLemma API

https://gerrit.wikimedia.org/r/1148797

Change #1148807 had a related patch set uploaded (by Unite together; author: Unite together):

[mediawiki/extensions/WikibaseLexeme@master] [Part of T329096] Implemented RemoveItemLabel API

https://gerrit.wikimedia.org/r/1148807