Random Data Generator

Random Data Generator is for filling database tables with random data. We use reverse regular expressions patterns for setting how the data will be generated. The generator patterns are saved for each column in the project file, so you don't have to define them each time you access this feature again. Important is to save the .dbs project file.

There are two major settings which you have to define :

  • Select the tables to fill in with data. This are usually the tables from the current opened layout.
  • Choose the pattern to use for each column. You can set them from the random data generator dialogs or directly in the layout by right-clicking each column.

Start the Random Data Generator from the Data Tools application menu or from any table right-click popup.

The first dialog shows the tables you to generate data for. Here you can choose for each table the number of rows to generate. Double-click any of the tables to set the generator pattern for each of the columns.

The order of tables is established by DbSchema, fixing the dependencies between tables via foreign keys. If a table 'Orders' is referencing 'Customers', the table Customers will be included first in the generation list.

Random Data Generator

After opening this dialog DbSchema will try to check if the tables has data or not. When clicking 'Generate' it will ask if the current data should be dropped. Please make sure you don't drop data on production databases ! Carefully read DbSchema messages.

Generator Patterns

From the first dialog double-click any of the tables to enter the current dialog.
Random Data Generator Patterns

Here you can edit:

  • Nulls - is a red (*) if the column is mandatory ( NOT NULL). In this case the Random Data Generator will always generate a value. If not, you can set a percentage of values to be NULL.
  • Seed - is an integer. If you set two columns with the same value, they will generate the same sequence of values.
  • Pattern - is the generator pattern. See below a description of available patterns.

The generator pattern can be any text. With some exceptions the text is interpreted as reverse regular expression text. Here are the different possibilities:

Random numbers

Use a pattern like int:from=0;to=10000;. Instead of int you can use integer, long, double, float or short. The from and to values are the interval where the values are generated.

For double and float you can use double:from=0;to=10000;format=####.##. The format is used to format the decimal digits. It is based on the java class java.text.DecimalFormat.

Samples:

int:from=0;to=100

Generated data : 12, 64, 34....
double:from=0.5;to=5.5;format=#.##

Will generate random doubles with two digits after comma : 0.53, 0.12....

Sequence

sequence:from=0;step=1;

Generates number starting at 'from' and increasing with 'step'

Random Date and Timestamp

Dates and timestamps have a pattern like date:from='01.01.2008';to='01.01.2009'; or timestamp:from='01.01.2008 00:00:00 000';to='01.01.2009 00:00:00 000';

We use for parsing the patterns and generating the values the date and timestamp formats specified in Edit menu / Configuration Dialog.

Values from File

list:<path_to_file>

The path_to_file should point to a text files. The generated values will be random lines of text from that file. The default files included in DbSchema are located in the dbschema.jar in the /generator folder. The files added later to the application will be saved in /Users//.DbSchema/config/generator.

Foreign Key Columns: Values from Primary Key Table

If your table contains columns referring primary key columns via foreign keys, the current column should contain only values which are listed in the primary key column. Using the pattern : 'values from primary key table' the random data generator will load the existing values from the primary key column and use one of this for generating values in the referring column.

The primary key table should contain at least one record so this pattern can work. Also if the actual referring column is set to be unique, the maximal number of records which can be generated limited to the number of records in the primary key column.

Reverse Regular Expression Text

Any text will be treat as a reverse regular expression string. Random text will be generated based on the current text.

Sample:

(My|Your|Their) friend (John|Mike)
    
Will generate one of :
My friend John
Your friend Mike
....
    

Boolean

May use percent_true as double between 0 and 1 to set the density of 'true' versus 'false' values. Percent_true default value is 0.5 .

boolean:percent_true=0.7

You can easy choose a predefined pattern from the Pattern Repository Dialog by clicking the '...' button in the above dialog.

Java Regular Expressions

Matching symbols
.           Matches any single sign, includes everything
^xxx        Matches xxx regex at the beginning of the line
xxx$        Matches regex xxx at the end of the line
[abc]       Can match any of the letter a, b or c. [] are known as character classes.
[abc][12]   Can match a, b or c followed by 1 or 2
[^abc]      When ^ is the first character in [], it negates the pattern, matches anything except a, b or c
[a-e1-8]    Matches ranges between a to e or 1 to 8
xx|yy       Matches regex xx or yy

Regular Expression Quantifiers

Quantifiers specify the number of occurrence of a character to match against.

x?          x occurs once or not at all
X*          X occurs zero or more times
X+          X occurs one or more times
X{n}        X occurs exactly n times
X{n,}       X occurs n or more times
X{n,m}      X occurs at least n times but not more than m times
Expressions Metacharacters
\d	        Any digits, short of [0-9]
\D	        Any non-digit, short for [^0-9]
\s	        Any whitespace character, short for [\t\n\x0B\f\r]
\S	        Any non-whitespace character, short for [^\s]
\w	        Any word character, short for [a-zA-Z_0-9]
\W	        Any non-word character, short for [^\w]
\b	        A word boundary
\B	        A non word boundary

Combine Values from Two Patterns

$First Name $Last Name

Use $PatternName to refer a predefined pattern. Predefined patterns can be edited in the Pattern Repository Dialog.

Java Groovy Patterns

groovy: income_cost + vat_cost

Java Groovy patterns can be used to do custom computation based on values already generated for other columns. For example if you have two other columns in the same table 'income_cost' and 'vat_cost', you can add them and place the generated result into another column. This pattern can use full groovy logic like :

groovy: if ( cost > 10 ) return cost*2; else return cost +5;
Supposing we have two columns start_date and end_date, we can generate an end_date after start_date with no more then 1 week, using:
groovy: return start_date + (int)(Math.random()*7)
Groovy is a Java based scripting logic. Please check in internet for Groovy tutorials.

Pattern Repository

The pattern repository is storing predefined patterns. DbSchema tries to automatically fit one of this patterns for each of the columns, so you can start generating data just in few clicks !
Random Data Generator Patterns Repository

Solve Errors

During data generation different errors may happen. In the Generator dialog click one of the tables and the errors will show in the bottom. You may click also the error to read more details.

At this point you have options to view the sample generated data or view the actual data in the table. Fixing the problem would mean to edit the pattern for the problematic column.
Solve Generator Errors