Neural Networks for Video Generation: 6 Simple Services

Neural networks have learned to generate videos so realistic that they are difficult to distinguish from real ones.

However, open models cannot yet create a video longer than a few seconds and do not follow the request very accurately. Therefore, to achieve a good result, you will have to make many attempts. I will tell you about neural networks that allow you to generate videos for free.

Choose a neural network for video generation

How to write prompts for neural networks

In any service, you need to write a prompt: a text request describing the result you want to get. Almost all of these services also let you upload a reference image on which the video will be based. As a rule, it is easier to get a good result with a reference, because the neural network has something to build on; otherwise it has to generate both the picture and the animation from scratch.

Prompts are usually written in English. The standard request scheme looks like this: style - object - action - environment - camera - lighting.

It is not necessary to specify all the parameters; the main thing is to describe the object and the action, the rest is optional. If you are generating from a picture, describe in the request what is happening in that picture. Here are words you can use:

style: cinematic action (movie scene), animation, black and white film;
object: woman with red hair, siamese kitten, lonely house;
action: walking, smiling, rolling;
environment: rooftop, medieval castle, cityscape;
camera position or lens type: wide angle, close up, long shot;
lighting: sunset, warm lighting, moonlight, studio lighting.
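The template above can be sketched as a tiny helper that assembles a prompt from its parts. The function name and parameters here are my own illustration of the scheme, not part of any service's API; only the object and action are required, matching the advice above:

```python
def build_prompt(obj, action, style=None, environment=None, camera=None, lighting=None):
    """Assemble a prompt following the style-object-action-environment-
    camera-lighting template. Empty parts are simply skipped."""
    parts = [style, obj, action, environment, camera, lighting]
    return ", ".join(p for p in parts if p)

# Only object and action are mandatory:
print(build_prompt("siamese kitten", "jumping"))
# -> siamese kitten, jumping

# The full template:
print(build_prompt(
    "woman with red hair", "walking",
    style="cinematic", environment="rooftop",
    camera="wide angle", lighting="sunset",
))
# -> cinematic, woman with red hair, walking, rooftop, wide angle, sunset
```

The same string works in any of the services below; paste it into the prompt field as-is.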


For a fair experiment, I tested all neural networks on the same prompts. The requests were as follows:

the person is reading the newspaper in a coffee shop (generated only by a text query);
cat jumping on blue plastic chair floating in air above clouds (generated by text query and image);
a cat sits on a soviet cabinet, walking away, yellow lighting (generated by text query and image).

In this picture the neural network will have to finish drawing the cat

Runway

What it can do: generates based on a text request and a picture
How well it generates: ⭐⭐⭐
How many free attempts: 65 seconds of video
How much does a subscription cost: from $15 (1,470 ₽) per month

Minimalistic Runway editor. On the left you can upload an image and enter a text query

Runway is a neural network that is known for creating cinematic videos. It can also make smooth transitions between frames, so it is often used for neural network short films or clips.

How to use. Register. On the main page, click Start a new session. The editor will open, where all the work takes place. Four modes are available:

Generation by text request and image or video.
Generation with camera control: lets you control the camera position during generation, which gives you more freedom when composing a scene.
Capture video for subsequent generation: you can record a video of your face so that the service can base the character's emotions and movements on it.
Video continuation: extends an existing video.

For your first attempts, select the first option and enter a query in English in the field. In the settings, you can choose the aspect ratio, the video duration (5 or 10 seconds), and the model version: Gen-2, Gen-3 Alpha, or Gen-3 Alpha Turbo.

Gen-3 Alpha Turbo is a fast version of the current model, Gen-2 is outdated, and Gen-3 Alpha is only available with a paid subscription. In Turbo mode, you cannot generate from text alone: you also have to upload a picture.

Model selection: Gen-3 Alpha is available by subscription only

The neural network will refuse to generate if your image contains copyrighted characters or celebrities. For example, it will not process an image with Nicolas Cage or Shrek.

Generating videos costs credits, five for every second. Runway gives 325 credits for registering a new account. That’s enough for 65 seconds of video created by the Gen-3 Alpha Turbo model. Credits don’t burn out over time, but they won’t be replenished either. To get more, you’ll have to register a new account.
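The credit math above is easy to sanity-check. A quick sketch using only the numbers quoted in this article (the variable names are mine):

```python
# Runway free-tier arithmetic: 325 starting credits,
# 5 credits per second of Gen-3 Alpha Turbo video.
STARTING_CREDITS = 325
CREDITS_PER_SECOND = 5

free_seconds = STARTING_CREDITS // CREDITS_PER_SECOND
print(free_seconds)        # 65 seconds of free video
print(free_seconds // 5)   # 13 five-second clips
print(free_seconds // 10)  # 6 ten-second clips, with credits for one 5-second clip left over
```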

A finished video can be continued, used as the source for a new generation, edited, or upscaled to 4K resolution. All of this costs additional credits.

Everything you can do with a finished video

What happens. It is impossible to generate for free from a text query alone, so I could only evaluate the image-based results. Expect to spend several attempts on one video, since Runway does not always follow the query accurately, even a short one.

For example, in my queries one cat split into two and became a monster, and the other one didn't want to leave the cabinet. Regenerating helped; fortunately, there are enough free attempts to run one query several times.

But another problem remains: when animated, the character changes and becomes more “neural” than in the original picture. The dynamics of Runway are also not ideal: the cats move too slowly.

The cat on the cupboard got a little smeared

A very slow jump, and the cat never quite makes it.

👍 Pros:

Enough free attempts.
Smooth transitions.

👎 Cons:

Doesn't follow the request closely.
Text-only generation is paid.
Refuses to generate known characters.

Pika

What it can do: generates based on a text request, or a text request and a picture; adds ready-made effects to videos
How well does it generate: ⭐⭐⭐⭐
How many free attempts: two videos per day, but no more than 16 per month per account
How much does a subscription cost: from $8 (784 ₽) per month

Pika's main page. Below is a field for entering a request, uploading an image, and selecting an effect. There are no other settings.

Pika is a video generation neural network that focuses on visual effects. It has gone viral on social media thanks to its AI filters that transform images: for example, they deflate objects like a balloon or cut them like a cake.

How to use. After registration, you will be taken to the main page with other people’s generations and a field for entering a request. There are several modes:

generation by text request;
generation by image and text request;
Pikaffect: template visual effects for animating images; sound is also added to the finished videos. There are 16 effects in total: flatten, tear, dissolve, inflate, explode, crush, melt, and so on;
Ingredients: the ability to upload several images, take the visual style from one and a character from another, and then generate a video based on the combination.

Select effects. Each has a preview

The easiest way is to use Pikaffect: upload a picture, choose a filter you like, click the generate button, wait for processing, and save the video. There are no settings here. If you don't like the result, you can regenerate the video by clicking Retry.

To create a video from scratch, write a prompt in English in the request field, and upload a picture if desired. In the settings, you can specify:

anything that shouldn't be in the video: such a request is called a negative prompt;
aspect ratio: only available for text-only generation. If you upload a picture, the aspect ratio is set automatically;
Seed: a numeric identifier of a specific generation. You can enter the seed value of a previous video to repeat its visual style.

When generating from text or an image without effects, Pika takes much longer to process the request: be prepared for that.

Model selection. Only Pika 1.5 is available for free, but the service is cunning: it selects Pika 2.0 by default, and after you click the Generate button it asks you to pay for a subscription

One generation costs 15 credits. Each user receives 250 free credits upon registration, and they are replenished every month, so you can generate 16 free videos monthly. At the same time, Pika allows only two videos per day.

If that number of videos is not enough, you will have to buy a subscription. Another option is to register a new profile. However, on the free plan the service adds a huge watermark across the entire video.


What happens. Pika does a great job of generating from a text query alone: the colors are rich, the compositions are beautiful. There aren't too many artifacts, but they are there.

Generation from the picture didn't work out as well: the cat became too contrasty and unlike itself, and it moved unnaturally. Clearly the neural network handles dynamic motion worse than scenes with only light animation.

The effects prepared by the service work perfectly, but only if you select a picture with a character that takes up most of the image. It simply does not recognize small objects and attaches the effect to empty space.

The neural network came up with an entire scene in a coffee shop, although this was not included in the original request

Generate a cat on a cabinet only by text request. Nice colors, but the shadow is wrong

The cat on the cupboard has changed a lot and become garishly oversaturated: that's a minus

The effect of cutting a cat like a cake

👍 Pros:

Easy to use.
There are ready-made effects.

👎 Cons:

Strict limits.
It takes a long time to generate if you don't select effects.

Hailuo AI

What it can do: generates by text request, by character, or by image and text request
How well it generates: ⭐⭐⭐⭐⭐
How many free attempts: about 30 videos
How much does a subscription cost: from $10 (980 ₽) per month

The editor is very simple. On the left is a field for entering a query, on the right is a field for generating

Hailuo is a Chinese neural network that is used to animate images and create memes. For example, at the end of 2024 it was used to make videos about a dog with an apple in its mouth. It also creates realistic videos that are often mistaken for real ones on social media.

How to use. After registration, on the main page, click on the Create section. You will be taken to the editor, where three modes are available:

Text to Video: generates a video based on a text query.
Image to Video: generates a video based on an image and a text query.
Subject Reference: generates a video of a specific character based on their image or photograph.

Select the mode you like and enter your text or upload your content. There are no complicated settings: only the choice of model, the number of generations at a time, and the Enhance Prompt button, which rewrites and improves the prompt so that the neural network understands it better. I recommend leaving it enabled.

There are only two models: the basic I2V-01 and the newer I2V-01-live. The second copes well with anime and is suited to animation and bringing 2D pictures to life; if you need realism, choose the basic one. When you are done, press the button with the shell to start the generation.

This is what choosing a model looks like

One generation costs 30 credits. After registration, you are credited with 1,000 credits, which must be spent within three days. Another 100 credits are added every day; they do not carry over to the next day and burn out if unspent. So in the first three days you can create 33 videos, and after that three per day. This is the most generous offer among all the video generation services here.
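The burn-out scheme is easier to see as arithmetic. A quick sketch with the numbers quoted in this article (the variable names are mine):

```python
# Hailuo free-tier arithmetic, using the numbers quoted in this article.
COST_PER_VIDEO = 30
SIGNUP_CREDITS = 1000  # must be spent within the first three days
DAILY_CREDITS = 100    # credited daily; unspent credits burn out

signup_videos = SIGNUP_CREDITS // COST_PER_VIDEO   # from the signup bonus
videos_per_day = DAILY_CREDITS // COST_PER_VIDEO   # every day after that
print(signup_videos)   # 33 videos in the first three days
print(videos_per_day)  # 3 videos per day afterwards
```

Because daily credits do not accumulate, there is no point saving them up; run your spare attempts every day.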

The results of the generations appear to the right of the editor. They can be saved, shared, or remade using the Recreate button. Regeneration will cost another 30 credits. The video cannot be edited.

Results. The videos look dynamic: the characters move actively, without the slow-motion effect you get from many other services. A big plus is the number of free attempts, so the same request can be run ten times over; one of those results is bound to be good.

If you generate only by text query, the results are quite unpredictable. Therefore, I recommend generating by picture — Hailuo copes with this perfectly. The videos are realistic, and the characters do not morph, that is, they do not transform into radically different characters.

Video for the text query “Man reading newspaper in a coffee shop” - good generation without obvious artifacts

Generated from a picture of a cat on a chair. Jumped smoothly and didn’t fall apart into artifacts

Generation from the picture of the cat on the cabinet. It walks away nicely, though a little unnaturally

👍 Pros:

Lots of free attempts.
Dynamic and realistic videos.
Easy to use.

👎 Cons:

There are practically no settings.

Kling

What it can do: generates by text request, or by image and text request; creates lip sync videos
How well it generates: ⭐⭐⭐⭐⭐
How many free attempts: about 10 videos per month
How much does a subscription cost: $10 (980 ₽)

Editor in Kling. On the left is the field for entering a query and settings, on the right are the results of generation

Kling is a Chinese video generation service that has become popular due to its high quality and realistic videos. Initially, it was only available to users from China, but is now open to everyone.

How to use. Register and click on AI Videos. This will open Kling Creative Space, the main space for creating videos. There are three modes available:

Text to Video: generation based on a text request.
Image to Video: generation based on a text request and an image.
Lip Sync: generates lip sync from a video. You upload a video with a character plus a voiceover or song, and the AI synchronizes the character's lip movements with the sound. It works only with people, not animals.

There are three models to choose from: Kling 1.6, Kling 1.5, and Kling 1.0. The higher the number, the newer the version and the more expensive the generation. The Kling 1.6 model is often unavailable to free users due to "high server load". I was never able to generate anything with it: the service constantly offered to buy a subscription.

Model selection in Kling. The most current one is selected automatically, but it is almost impossible to generate anything with it on the free plan

If you chose generation by text or image, write a request and optionally upload an image. Below the prompt field are additional settings that affect the result:

Creativity: how closely the model follows the request. The slider moves from Creativity to Relevance, roughly from "make it beautiful" to "follow the instructions as accurately as possible".
Mode: Standard is faster, Professional gives a higher-quality picture. Only five test attempts in Professional mode are available for free.
Length: the video duration, 5 or 10 seconds.
Aspect Ratio: available only for text-only generation.
Generating Count: the number of generations at a time; more than one requires a subscription.
Camera Movement: control of camera movement; available only with a paid subscription.
Negative Prompt: a description of what should not be in the video. The service does not always follow it exactly and can ignore some of the requirements.

Some of the settings in Kling. To see everything, you need to scroll down

You can leave the default settings untouched: Creativity at 0.5, duration 5 seconds, horizontal frame with a 16:9 aspect ratio. When you are done, click Generate and wait for the result.

One generation costs 35 credits. On registration the service credits you with 366, replenished every month. In total, you will be able to generate 10 videos; after that, register a new account or buy a subscription.

Compared to other services, Kling generates very slowly: some requests take three hours. It can happen that you wait several hours only for the request to end with an error. Only lip sync videos are generated quickly.


Results: Kling does a great job of generating realistic videos. There are virtually no artifacts visible in the videos. Characters do not transform into other characters right in the middle of the video — the morphing problem that many other services face is virtually absent.

Kling’s strong point is its dynamics. The movements are smooth and energetic, there is no “frozen picture” effect or strange pauses between frames. Even with complex movements - for example, a cat needs to jump onto a chair hanging in the air - the quality remains stable. The only downside is that on the free plan you have to wait a very long time for results.

Only the generated font looks unnatural on the newspaper, otherwise it’s not bad

The cat jumps onto the chair smoothly and convincingly, there are no serious artifacts

The cat looks around realistically and descends

👍 Pros:

Realistic and high-quality videos without artifacts.
Lots of free generations.

👎 Cons:

Most of the settings are available only by subscription.
Very long generation times.
The current model version is available by subscription only.

Genmo

What it can do: generates based on a text request
How well it generates: ⭐⭐⭐
How many free attempts: 2 videos every 6 hours, but no more than 30 per month
How much does a subscription cost: $10 (980 ₽) per month

Genmo editor. There is a prompt field in the center of the screen. Generations appear at the bottom

Genmo provides access to the Mochi-1 model, which is open source: you can download it, install it on your computer, and customize it. The site also offers a simple web interface for Mochi-1, without any complicated installation hassle.

How to use. On the main page, click Try Now and register. An editor will open with a line for entering a request and a drop-down menu with settings. The results of your generations and the work of other users will appear below.

There are two models to choose from: Mochi 1 and the outdated Legacy v0.2. The latter has many settings and supports generation from an image, but its quality is significantly worse than the current model's, so I do not recommend using it.

Mochi 1 has almost no settings. You can set a seed, remove the watermark, and generate the video privately, but the last two options are only available with a subscription.

Selecting a model and settings

Write a prompt in the request field. You can use Russian, but the neural network understands it worse than English. If you have no ideas, there is a Need prompt ideas option under the field: click it and a random generation idea appears. When finished, click Generate.

You can generate two videos every six hours and about 30 videos per month for free. The result cannot be edited or modified. You can only download, try again or change the prompt.

Results. Visualizing people is difficult for the neural network: in my request, it refused to generate a person and focused on the newspaper. At the same time, even in such a video there were many artifacts and blurred space.

The neural network also ignores parts of the request: for example, the cat did not jump off the cabinet but remained sitting there. Still, such low-dynamics animation looks good. Mochi 1 is suited to relatively static scenes; complex movements are difficult for it, though the large number of free generations smooths out this drawback.

After several attempts, it was still not possible to generate a person - he constantly hides behind a newspaper

But the cat on the cupboard is good. True, it doesn’t jump off and just sits

The craziest generation, which shows that neural networks can’t do everything, but they can amuse

👍 Pros:

Lots of free generations.

👎 Cons:

Generates poorly in dynamic scenes.
Ignores parts of the request.
No settings and no generation from a picture.

Pixverse

What it can do: generates based on a text request, an image, or a video
How well does it generate: ⭐⭐⭐
How many free attempts: 5 videos on the first day, then 2 daily
How much does a subscription cost: from $10 (980 ₽) per month

This is what the editor looks like. Below is a field for entering a query and settings

Pixverse stands out because it takes only about 10 seconds to generate a video, making it the fastest neural network in this selection. You pay for that speed with quality.

How to use. Register and click Create. A field for entering a query will appear. Enter a query in English, if desired, upload a picture or video for reference. After that, set the settings:

model: V2.5, V3, and V3.5. Choose the last one, because it is the most up to date and generates the best quality;
effects: filters that write the request for you; all you have to do is enter the object. There are Hulk, Joker, Santa, Lego, zombie, and other filters;
style: anime, 3D, comics, cyberpunk, clay characters;
transition: the ability to load the next reference frame to transition to it;
character: the ability to upload a picture of a specific character and generate a video with them;
video length: 5 or 8 seconds;
aspect ratio: available only when generating from a text request;
negative prompt: something that shouldn't be in the frame;
Seed: a numeric identifier of a specific generation. You can enter the seed value of a previous video to repeat its visual style.

Once you have set the settings, click Create. One generation uses 30 credits. After registration you get only 130; after that, 60 credits are credited every day. Unspent credits burn out.

The result will appear in a few seconds above the request field. You can regenerate the video, extend it by five seconds, increase its quality, or download it. You can also make the character lip sync. Any manipulation of the video is paid for with credits.

Results. The videos turned out to be quite realistic. The neural network tries to follow the request, although it does not always succeed. For example, with the cat, there was a funny, but quite typical hallucination - the animal tries to jump on a chair, but falls through it.

Relatively static videos with light animation come out decently: the girl in the coffee shop looks acceptable. When it comes to movement and dynamics, the result looks unnatural, and there are strange defects and distortions visible to the naked eye.

The girl is quite realistic, but the jerky movements and the text on the newspaper give away the neural network

The cat falls into the chair in a funny way

The cat morphs a little, but there is an interesting scene in which it jumps from the cabinet onto the table.

👍 Pros:

Very fast generation.
The neural network follows the request well.

👎 Cons:

Doesn't handle dynamics well.
Artifacts are noticeable.