Overview:
Three different aspects in the context of AI model training can be patentable:
a. Generating training data for use in training the AI model;
b. Training the AI model using the training data (see AI Basics Part 3: How can the AI model (neural network) learn from data? and Patentability of AI – Part 3: Claim directed to technical application field (“Applied AI” – Dimension 2)); and
c. Using the trained AI model during inference (see Patentability of AI – Part 3: Claim directed to technical application field (“Applied AI” – Dimension 2)).
Most AI-based inventions concern applied AI (i.e. applied machine learning). These inventions typically rely on open-source frameworks (e.g. from OpenCV or Hugging Face), so the mathematical model / neural network used does not itself differ from the prior art.
However, these open-source AI models are often trained or fine-tuned for a particular task using a tailored training dataset. Hence, in supervised machine learning the actual innovation often lies simply in the training dataset used.
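To make this typical scenario more concrete, the following minimal sketch (in PyTorch/torchvision; not taken from any of the applications discussed in this series) fine-tunes a standard, pretrained open-source model on a tailored dataset. The dataset folder and the classification task are merely illustrative assumptions.

```python
# Minimal sketch: fine-tuning an off-the-shelf pretrained model on a tailored dataset.
# The dataset path and the task are hypothetical and chosen only for illustration.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Tailored training dataset -- this is where the actual innovation often lies.
transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
train_set = datasets.ImageFolder("data/tailored_training_set", transform=transform)  # hypothetical path
loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)

# Standard open-source model; the architecture is unchanged with respect to the prior art.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))  # new task-specific head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

Note that the network architecture itself remains completely standard; the only tailored ingredient is the training dataset.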
Therefore, two questions may arise:
- Is a training dataset patentable, or is a model trained with this dataset patentable? In brief: yes, and we will show you below how.
- Is it required to publish the complete dataset? Happily, no, but consider our recommendations below.
Step by step:
As you may already have guessed, an invention related to a particular training dataset concerns Dimension 2: Invention directed to a technical application field (“Applied AI”).
The EPC Guidelines point out that the generation of a training dataset and the training method using the dataset can be technical: “Where a classification method serves a technical purpose, the steps of generating the training set and training the classifier may also contribute to the technical character of the invention if they support achieving that technical purpose” (EPC Guidelines G-II, 3.3.1).
Accordingly, it is decisive whether the purpose of the underlying dataset is a technical one or not. Hence, it does not matter that the data samples contained in the dataset might also be usable in non-technical applications, as long as the dataset has a technical purpose. It appears important to understand the difference between application and purpose here: we do not care what is actually done, i.e. the “application” (in any case, a dataset never does anything by itself, or have you ever seen a telephone book make a call?). It is only important what can be done using the dataset (e.g. calling all the people in the telephone book and inviting them to your best friend’s house party).
Moreover, it seems advisable to describe the algorithm for generating/obtaining the dataset in detail in the patent application. Assuming your dataset has a technical purpose, each feature of said algorithm is to be considered as supporting the achievement of this technical purpose. Therefore, each of these features is to be taken into account when assessing the inventive step. Accordingly, each of these features can be a valuable fallback position for claim 1. Likewise, any particular characteristics of the training set itself, e.g. the properties and format of the training labels, can contribute to achieving the technical purpose and should thus also be described in detail. For example, think of a particular form of a segmentation mask (= the label) used to annotate the image samples of the training set, as in the sketch below.
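To illustrate what such a generation algorithm might look like, here is a minimal, purely hypothetical sketch in Python that produces (image sample, segmentation mask) pairs. The labelling rule (thresholding the brightness channel) and the folder names are assumptions made only for illustration; in a real application, the claimed generation algorithm and label format would take their place.

```python
# Hypothetical sketch: generating a training set of (image sample, segmentation mask) pairs.
# The labelling rule and folder names are illustrative assumptions only.
from pathlib import Path
import numpy as np
from PIL import Image

def generate_sample(src_path: Path, out_dir: Path, threshold: int = 128) -> None:
    """Create one annotated training sample: the input image plus a binary
    segmentation mask (= the label) marking pixels brighter than `threshold`."""
    image = np.array(Image.open(src_path).convert("L"))
    mask = (image > threshold).astype(np.uint8) * 255  # particular form of the label
    out_dir.mkdir(parents=True, exist_ok=True)
    Image.fromarray(image).save(out_dir / f"{src_path.stem}_sample.png")
    Image.fromarray(mask).save(out_dir / f"{src_path.stem}_mask.png")

for src in Path("raw_images").glob("*.png"):  # hypothetical source folder
    generate_sample(src, Path("training_set"))
```

Every design choice in such a pipeline (selection of source images, the labelling rule, the particular form and format of the mask) may support the technical purpose and is thus a candidate fallback position worth describing in detail.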
Besides the generation of the training set, the method of training the AI model can also be patentable, as well as the use of the trained model at inference (i.e. in its intended application). Respective claims may simply refer back to the training set generated in a method according to claim 1. However, in some cases the training method may have its own inventive features, cf. the example discussed in the blog article “Patentability of AI – Part 3: Claim directed to technical application field (“Applied AI” – Dimension 2)”.
Importantly, it is NOT required to publish the whole dataset (which is often the Applicant’s actual asset). As pointed out in the EPC Guidelines G-II, 3.3.1, if the technical effect is dependent on particular characteristics of the training dataset used, those characteristics that are required to reproduce the technical effect must be disclosed, unless the skilled person can determine them without undue burden using common general knowledge. However, in general, there is no need to disclose the specific training dataset itself (also cf. the blog article “Patentability of AI – Part 4: Requirements to the disclosure of the AI invention (Art. 83 EPC)”).
Have a look at the following example for Dimension 2:
Claim 1 of PCT/EP2019/063650:
- A method for training a generative adversarial model generating image samples of different brightness levels, comprising the steps of:
a – obtaining (S01) a set of training image data comprising for each of a plurality of training images an input image sample and a target image sample representing the same image but in a different brightness level,
b – providing (S02) an image generating model having an encoder configured to receive the input image sample and a plurality of decoder branches configured to output each a generated output sample,
c – providing (S03) a discriminator model,
d – training (S04) the image generating model using the set of training image data based on a predefined loss function, wherein for each training image among the decoder branches only the decoder branch whose generated output sample has a minimum loss compared to all other decoder branches is optimized based on said predefined loss function, and
wherein the training step (S04) of d is augmented by an adversarial loss which is based on the output of the discriminator model.
- Technical character of the claimed invention was not objected to in the ISR
- Generated image samples can have a technical purpose: training an automated driving system to learn driving in different daylight conditions
- However: Claim 1 is NOT limited to the technical application, i.e. the automated driving system (not objected to by the EPO!)
Said invention proposes to train a GAN model to generate night-time images based on provided daytime images. The motivation is straightforward: there exist large datasets of annotated training images taken at daytime for training an automated driving system, whereas only few corresponding datasets taken at nighttime or at twilight are available. Accordingly, the invention allows an AI model to be trained on different daylight conditions without the need to manually annotate nighttime images.
Note, however, that claim 1 leaves it open whether the generated images are actually used to train an automated driving system or not. In other words, claim 1 actually also covers non-technical applications! However, with a convincing and detailed description which focuses on the technical purpose(s) of the generated dataset, the EPO does not seem to require any further limitation in claim 1. In other words, the GAN model of claim 1 can also be used to generate images without a technical purpose, e.g. a full-moon version of Da Vinci’s Mona Lisa for a poetry collection.
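For readers who prefer code to claim language, the following is a deliberately simplified sketch (our own illustration, not code from PCT/EP2019/063650) of the “winner takes all” idea in step d of claim 1: several decoder branches share one encoder, and for each training image only the branch with the smallest loss is optimised. The network sizes and the plain L1 reconstruction loss are assumptions; the adversarial loss term of the claim is omitted here (the GAN principle is explained in the excursion below).

```python
# Simplified, hypothetical sketch of step d of claim 1: only the decoder branch
# with the minimum loss is optimised for a given training image.
import torch
import torch.nn as nn

class MultiBranchGenerator(nn.Module):
    def __init__(self, channels: int = 3, num_branches: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU())
        self.branches = nn.ModuleList(
            [nn.Conv2d(16, channels, 3, padding=1) for _ in range(num_branches)]
        )

    def forward(self, x):
        features = self.encoder(x)
        return [branch(features) for branch in self.branches]

model = MultiBranchGenerator()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
reconstruction_loss = nn.L1Loss()

# One training step on a dummy (input image, target image) pair.
input_image = torch.rand(1, 3, 64, 64)   # e.g. daytime image
target_image = torch.rand(1, 3, 64, 64)  # same scene at a different brightness level

outputs = model(input_image)
losses = torch.stack([reconstruction_loss(out, target_image) for out in outputs])
best_loss = losses.min()      # only the best branch contributes to the gradient,
optimizer.zero_grad()         # so only that branch (and the shared encoder)
best_loss.backward()          # is optimised for this training image
optimizer.step()
```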
Short excursion: What actually is a GAN (generative adversarial network) model?
A GAN is basically trained to generate images (or other data samples). The concept was initially developed by Ian Goodfellow and his colleagues in June 2014 (have a look at his paper).
Yann LeCun, Facebook’s chief AI scientist, has called GANs “the coolest idea in deep learning in the last 20 years.”
A GAN uses two neural networks that compete with each other to improve the quality of generated images. A “generator” network creates a synthetic image based on an initial set of images, such as a collection of faces or, in our example, nighttime images. A “discriminator” network tries to detect whether the generator’s output is real or fake. If the discriminator “wins” (i.e. correctly discriminates the fake image from a real one), the generator is penalized (i.e. further optimized), and vice versa. This cycle is repeated many times, until the discriminator can no longer distinguish between the fakes generated by its opponent and the real thing. The ability to create high-quality generated imagery has thus increased rapidly.
After this joint training of generator and discriminator, the generator can be used to produce realistic fake images. In other words, the discriminator is only needed during the training of the generator.
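To make the cycle described above more tangible, here is a deliberately simplified GAN training loop in PyTorch. It uses toy vectors instead of real nighttime images and arbitrarily chosen layer sizes; it is a sketch of the general GAN principle only, not of the specific model of the application discussed above.

```python
# Simplified GAN training loop: the discriminator learns to tell real from fake,
# the generator learns to fool it. Toy data and sizes are illustrative assumptions.
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 64
generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, data_dim))
discriminator = nn.Sequential(nn.Linear(data_dim, 128), nn.ReLU(), nn.Linear(128, 1))

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(32, data_dim)               # stand-in for real training samples
    fake = generator(torch.randn(32, latent_dim))  # generator's synthetic samples

    # 1. The discriminator tries to "win": label real samples 1 and fakes 0.
    d_loss = bce(discriminator(real), torch.ones(32, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(32, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2. The generator is optimised to make the discriminator label its fakes as real.
    g_loss = bce(discriminator(fake), torch.ones(32, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

# After this joint training, only the generator is kept to produce realistic samples;
# the discriminator was only needed during training.
```

The `detach()` call ensures that the discriminator update does not propagate gradients into the generator; this alternation of the two updates is the adversarial cycle described above.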
But wait: Why is this GAN approach relevant for understanding the patentability of datasets?
Because a GAN can also generate fake images which do not have a technical purpose!
For example, the organization Obvious trained a GAN model on a set of 15,000 portraits from the online art encyclopedia WikiArt, spanning the 14th to the 19th century. In 2018, the trained GAN model was used to generate the portrait painting “Edmond de Belamy”, which was sold at a Christie’s auction for $432,500.
However, in contrast to the generated nighttime images of the example above, “Edmond de Belamy” is rather an aesthetic creation in the sense of Art. 52(2)(b) EPC without any technical purpose.
Hence, the same (GAN) technology is used once to serve a technical purpose, thus constituting patentable subject-matter, and once not.
***
In summary, it mainly depends on how you describe the (technical) purpose of your generated training dataset. A good and detailed explanation of the possible technical purpose can be decisive for the grant of a patent, even though the claimed technology could also be used in non-technical applications. Hence, regarding the technical application of the invention: keep claim 1 general and the description specific!
author: Christoph Hewel
email: hewel@paustian.de
(photo: Pierrevert, PACA, France. Training data is as valuable as lavender oil. Only it doesn’t smell as good.)