AmiBroker Users Knowledge Base Setup A Custom Database (v3)
Post on: 8 Июль, 2015 No Comment
October 17, 2007
Setup A Custom Database (v3)
The objective for this tutorial is to setup an example database, including grouping and sector classification. Some basic database management skills are demonstrated and then built on in a later Users Knowledge Base (UKB) article: Setup A Custom Database Nasdaq
- - AmiBroker: Standard v4.9
- - Amiquote: v1.94 registered
- - Time Frame: Daily
- - Data Source: Yahoo
- - Data Type: US Stocks
- - OS: Windows XP (Home)
- - Software: Microsoft Office XP Professional 2002
- - Browser: Internet Explorer v7.0
It is assumed that readers are familiar with the preceding articles in the UKB >> Database Management   sub-category and also the AmiBroker Help >> Tutorial >> Database Management section.  
Note: To see articles in their order of precedence it is necessary to go the the UKB Homepage and browse the Table Of Contents. In all other places, including RSS feeds, UKB searches and category filings, the posts are sorted by the publication date, which is often the reverse of the order of precedence.
AmiBroker is a flexible program, that, in many areas, allows owners to use it according to their personal preferences. This is nowhere more evident than when sourcing data. Data can be obtained from providers, who manage the classification and categorization of stocks for their subscribers, or, for those who want to do their own thing, custom databases can be setup using alternative or unique categories.
There is no set nomenclature, or structure, for categorizing database constituents, other than the limits set by the program and by the way in which the markets work:
- - Markets, Groups, Sectors and Industries   are major, market orientated, categories.
- - Watchlists and Favorites are minor, personally orientated, categories.
- - Indexes is a market orientated category grouped with the personals. 
- - All categories can contain sub-categories except for Favorites and Indexes .
- - Sub-categories under the major categories are mutually exclusive i.e symbols can only belong to one sub-category at a time.
- - When symbols are re-classified within a major category they are effectively moved from one sub-category to another.
- - Watch list sub-categories are not mutually exclusive i.e. symbols can belong to more than one sub-category.
- - Sub-categories only go down one level, except for Sectors, that goes down two i.e. Sectors > Industries.
- - Unclassified (Unassigned, Undefined) should be used for symbols that dont fit neatly into a defined sub-category.  
- - When symbols are assigned to two or more sub-categories the most recent operation will take precedence.
- - Only sub-categories that contain symbols will be visible, as a folder, in the Workspace Symbols List.  
This example imports a list of the Dow Jones Industrial Average (^DJI) members and divides it into some arbitrary groupings, to demonstrate the principles of categorization.
Note: The database comprises 32 symbols, some of which are redundant i.e. there is no data available for that symbol (redundant symbols have been included to demonstrate a point). It has been broken into two groups (Group A and Group B) and 6 sectors (DJI 0-5) containing five symbols each.  A redundant symbol has been added to the last two sectors (4&5).
CREATE A NEW DATABASE.
The first step in the process is to create a new database.
Sector and industry classifications are put in place when a database is created, so it is necessary to think about the required classifications before hand.
To create a database without classifications:
- 1) Make a backup copy of the original broker.industries and broker.sectors files in the C:/Program Files/AmiBroker folder (or the equivalent) then delete the original versions.
- 2) Download the resetindustries.txt and resetsectors.txt files attached to this post and save them in the C:/Program Files/AmiBroker folder to replace the original versions (rename them as broker.sectors and broker .industries ).
Note: The files can be opened and read in a text editing program like Notepad .
3) Create a new database called Data_CustomExample using the default EOD settings.
When the database is created the Sectors and Industries sub-categories will automatically be set to the broker file specifications i.e. in this example the first sub-category in Sectors and Industries will become Unclassified and the remainder will be set to the defaults i.e. Sector 1 63 and Industry 1 255 (this removes any redundant settings, for those categories, from the Symbol >> Categories or Symbol >> Organize assignments windows).
Note: Redundant titles can be retained in the Categories or Organize assignments lists from database to database. In practice they do not have any adverse effect on the current database. In this example, the Sectors and Industries sub-categories have been reset to the defaults so that readers, who are following the example, will achieve exactly the same outcome as shown.  Normally the broker files dont need to include the default names; the end result, in the Workspace Symbols Tree. would have been exactly the same if the broker files had listed Unclassified as the sole sub-category.
The Markets and Groups categories are not established when the database is created and they need to be set manually (if they are not reset the defaults will automatically be used, including any redundant sub-categories that remain from the previous database).
Note: When symbols are first imported into a new database they are entered under the first sub-category in each category by default (the exception is Groups that defaults to Group 255).
4) Go to Symbols >> Categories >> Markets/Groups and Edit the first name in the list as required then click on OK to save the changes.
For this example the Markets default needs to be NYSE and the Groups first sub-category needs to be Unclassified.  
Once the Markets and Groups have be named symbols can be imported into the database.
In this case a list of symbols, in Comma Separated Values (CSV) format, will be entered into the database using the Import Wizard .
Note: The lists used in this example can be downloaded, in Excel format, from the Attached Files section at the end of this post.  To create text files, suitable for importing into watchlists, copy the symbols from the relevant Excel column and paste them into a text editor, like NotePad .
To import from a CSV file using the Wizard:
1) Before importing symbol lists check the source file:
- - If the file is in Excel format Save it As a CSV file before proceeding.
- - It should contain only one Worksheet so Delete any additional Sheets.
- - Copy any columns that are constructed from underlying formulas and Paste special as Values to stabilize the  data.
- - Remove any extraneous material (Delete any notes or comments, other than the title row, and also any redundant columns).
- - Save the modified CSV file locally, using a meaningful name that associates the contents with an Industry sub-category.
Note: The ASCII Import Wizard and watch list importation use different formats. ASCII import allows for importing data to several fields, including the fullname, while watch lists are limited to ticker lists only. As a rule of thumb ASCII import should be used for global importing, and attribution of information values, while watch lists importation should be used to manage categorization i.e. move symbols from one category to another.
WARNING: THE IMPORT PROCESS WILL STRIP ANY ADDITIONAL  WORKSHEETS FROM THE FILE IF THEY ARE NOT MANUALLY REMOVED FIRST.
- 2) Open the Import Wizard using the File > Import Wizard command from the AmiBroker menu bar.
- 3) Click on Pick files.
4) Select the file to be imported from the file browser window that pops up and click on Open .
The selected file will be listed in the Wizards information pane.
 
The Define Fields window will open with a preview of the file format in the Data file sample pane.
Note: In this example the file has a single header row (title), the first data column contains the Symbol (Ticker) and the second column contains the Name (Fullname). The data columns are separated by a comma and there is no quotation data included.
6) Define the fields according to the file format:
- - use the drop-down menus to set Column 1 to Ticker. Column 2 to Fullname and the remaining columns to Skip (ignore) anything else,
- - set the Separator to Comma(,),
- - pick Unclassified from the Group drop-down menu,
- - skip the first line (title row) by entering 1 into the Skip first: input box,
- - check Log errors (optional),
- - check Automatically add new symbols (all of the symbols are new to the database and need to be added),
- check No quotation data (the list does not contain any quotation data as only Tickers and Fullnames are to be imported).
When the settings are correct click on Next > and the Additional Settings window will open.
- 7) For this example there is no need to save the import format. Click on Finish to complete the symbol/information importation.
- 8) Save the database using the File >> Save Database command from the menu bar.
Note: It isnt essential to save at this stage but it is a good precaution to take before continuing on.
All of the symbols are arranged in one market (NYSE), and they are Unclassified at the Group/Sector/Industry level, as expected.
This establishes Data_CustomExample as a global DJI database.
From there it can be categorized according to personal preference.
CATEGORIZE THE DATABASE CONSTITUENTS
Setup Group Assignments
Some planning is also required when assigning symbols to groups as the sub-groups need to be unique and the order in which the assignments are made can affect the outcome.
For this example the ^DJI database will be separated into Group A and Group B listings (once again this is for demonstration purposes only and it has no practical application in the real trading world).
To assign symbols to Groups:
1) Enter a name for the groups, in the order required, and click on O.K.
Note: The order is permanent i.e. sub-categories will not be alphabetically sorted in the Workspace window.
2) Import the A and B list symbols using the Wizard .
Pick Group A from the Group drop-down menu to assign the symbols to the relevant group.
Note: There is no need to check Automatically add new symbols since the groups are sub-sets of the global database (there are no new symbols since all the symbols in the group lists were already imported at the market level) .
Repeat the import procedure using the Group B file.
The Group assignments have now been installed (all of the symbols have been moved from Unclassified to the relevant group folder and the Unclassified group folder is empty).
Setup Industry and Sector Assignments.
Setting up the sectors and industries is a little more complicated and requires the use of an AFL formula.
The formula does the work of assigning the symbols to sectors but before it can be used some preparation is required:
- - the Sector and Industry sub-categories need to be named,
- - the watch lists need to be named to correspond to the Industry names (this is optional but it is a good practice to adopt to make sure that symbols end up in the right place).
- - downloaded lists, in CSV or Excel format, need to be converted into a watch list compliant format and imported into the database (once again it is recommended that the files are named to match the industry names),
- - price data needs to be imported  (AFL formulas will not work with symbols that do not contain quotes). 
Note: A handy tip, that the author picked up from Richard Dale of Norgate Investor Serices. is to append the sub-category number to the title e.g. when renaming List 0 to indicate that the folder contains DJIA components use the title DJIA 0 or DJIA (0) etc (once again this helps to ensure that symbols end up in the right place after a hard day at the office).
To name sector, industry and watchlist sub-categories:
1) Name the Watch lists so that they correspond to the Industry sub-category names.
Note: Start at the head of the list and do not leave any unnamed sub-categories in between the first and last named item.
2) Repeat the exercise for the Sectors sub-categories.
3) As there is no information available on industry classifications, for this database,  name the Industries in exactly the same way (this will establish a one to one relationship between the Sectors and the Industries ).
After the Sectors and Industries have been named the Sector that corresponds to the Industry will be listed in the Settings pane (click on any industry sub-category to see the associated sector).
The default Sector/Industry. in the Workspace Symbols Tree. has changed from Unclassified to Djia0 .
Note: The smart way to setup the Sectors and Industries would have been to list them in the broker files, prior to initializing the database. The names were entered manually, in this example, for demonstration purposes only (refer to AmiBroker Help >> Contents >> Tutorials >> Database Management >> Understanding Categories >> How to define your own sectors and industries for assistance with setting up 1-1 sector to industry broker files).
To prepare a watch list compliant file from a spreadsheet symbol list:
- 1) Select the column, to be copied, by clicking on the Column Title e.g. A.
- 2) Right click inside the selected area and Copy the symbol list using the context menu that opens.
- 3) Open a text editor, like NotePad. position the cursor in the white space of a new blank page, right click and select Paste from the context menu that opens  (this will produce a plain text file with a single symbol per line formats from the spreadsheet will be stripped from the text version).
- 4) Edit out the title rows, or extraneous text, if any, and save the file locally.
To import symbol lists to the corresponding watch lists:
1) In the AmiBroker Workspace right click on any symbol, in any symbol list, and select Watch list >> import from the context menu that opens (a Select Watch list(s) window will open).
Note: The folders in AmiBrokers Symbols Tree do not have a right click >> context menu, with the exception of watch list folders that contain symbols. The alternative path to the Watch list menu is  Symbol >> Watch list from the menu bar.
2) Select the Watch list that is going to receive the imports, select the Import/Export from /to: File radio button and click on OK (a file browser window will open).
3) Pick the text file that contains the ticker list to be imported and click on Open.
The watch list folder, containing the imported symbols, will automatically be established in the Workspace(repeat the import process for the remaining industry orientated watch lists).
Note: At this stage all symbols are still classified under the default Sector/Industry i.e. Djia0
It is commonplace for constituent lists to contain redundant symbols. In some cases the symbol has been changed and in others the company has ceased to trade or has undergone restructuring of some kind. Symbols that contain no quotes will not be assigned to categories by AFL formulas so it is necessary to download some data to the database to give the assignment code something to work with.
Current data will download from Yahoo orders of magnitude faster than historical data. Since only one quote is required, for AFL formulas to function, it makes good sense to use the Yahoo Current option for the download, especially when the procedure is applied to working databases containing thousands of symbols.
To download temporary data:
- 1) Open AmiQuote while AmiBroker is open with Data_CustomExample as the current database.
- 2) Get the tickers from AmiBroker using the AmiQuote  Tools menu and Automatically import data with Yahoo Current as the selected source. The status will be recorded as the download proceeds.
Note: AmiQuote does not show error messages, in current mode, for symbols that are redundant, or have no data to download.
The current data for the day will only be available on the Yahoo server for a limited time (it is reset to zero in time for the new trading day). In one trial, conducted by the author, current data, for the American market, ceased to be available from 4 AM Eastern Time (USA).
A quick check can be made to see if the current quotes are extant at the time that a download is required.
To check the availability of current quotes:
1) Enter a valid stock symbol in the Quotes input box. at any Yahoo! Finance page, and then click on Get  Quotes .
The Quotes Summary page will open.
2) Look at the Change report (if the current data is still on the server the change % will be shown).
Note: When the current data is deleted from the server the Change % report, and other outputs in the quotes panel at Y!Finance, reset to zero or N/A (the time might vary from day to day or with different exchanges for the American market). For international markets allow approximately four hours before the opening bell (the author only checked a limited number of international exchanges readers should check their own exchange for themselves).
Once the global database contains all of the required symbols, complete with data, the following AFL formula can be used to add symbols to a nominated industry, watch list by watch list.
To use watch list membership to assign symbols to industry groups:
1) Run an exploration using the X_CategoryAddSymbols formula, with all symbols selected and the Range set to n last days = 1.
Note: AddColumn(Close), in the formula, serves no practical purpose, in terms of assigning the symbols to the industry sub-categories; it is only there to allow Explorer to run.
Sectors and Industry classifications are automatically assigned as the formula is processed.
Note: SBC and HWP are not assigned because they do not contain data (no quotes symbols remain behind in the default industry folder i.e. Djia0, while all valid symbols are moved to their nominated industry folder).
2) Manually check and repair or delete no quotes symbols from the database.
Note: The quotes status, for databases containing only the current days data, can be seen in chart window for the selected symbol.
When the database quality has been confirmed, historical data can be downloaded to create a working database.
Note: To download historical data; open Amiquote. get the tickers from AmiBroker   and then download with the date range set to capture the required history (the historical download will overwrite the temporary current values).
POST CREATION FIXES
After a database is created it sometimes becomes necessary to make changes or repair any errors that are found.
There are a lot of different situations that can arise and they often require a creative approach to find ways to work around them. As a rule of thumb; use ASCII import to add symbols to the database, so that the information values can be appended, use watch-lists to categorize symbols in bulk or Symbols >> Assignments Organizer to manually move small numbers of symbols around and for data acquisition; import new symbol lists into AmiQuote (open a plain text list in AmiQuote as a .tls file) then download data for the new symbols and push it into the database, using the Automatic import checkbox in AmiQuote, (this approach allows a match-up between the imported date range and the existing range of the database).
One example from this database.
MSFT and  INTC are Nasdaq listings that are in the NYSE Market folder.
To move symbols from one market to another after the event:
- 1) Import a ticker list of symbols to be moved into a vacant Watch list folder e.g in this example List 6 is used since the first six Watch lists (0-5) still contain the sector component lists.
- 2) Run an exploration using the following AFL formula (an AFL version is attached in the Attached Files section at the end of this post).
After categorization of the database is complete any redundant Watch lists can be emptied.
To empty redundant Watch lists:
- 1) Right click on the Watch list folder.
- 2) Select Erase (make empty) from the context menu that opens.
- 3) Nominate the Watch list in the Select watch list(s) window that opens.
The symbols will be removed from the Watch list. but not the database.
Note: Empty Watch list folders will not be present in the Workspace Symbols Tree next time the database is opened.
That concludes this tutorial that demonstrates some basic database management procedures.
The method chosen is only one of several ways of achieving the same, or similar, outcomes e.g. the procedure could be described as a top-down approach but the bottom-up path, where the global database is compiled by importing Watch lists via the Import Wizard. could just as easily have been used.
It isnt necessarily the best possible way to go about it and users should experiment for themselves by mixing and matching the various data management tools available.
The base method will be used as a template, for a subsequent post, that will construct a working Nasdaq Composite database (some slightly more advanced points, that expand on some of the key concepts introduced so far, will be detailed there).
Logic for X_categoryAddSymbols
Passes all symbols for processing.
for(W = 0; W < 63; W++)
Is a form of counter (counters are standard fare in programming languages).
It comprises three statements:
- W = 0 assigns the number 0 to the variable W as the starting value,
- W < 63 sets an upper limit to the count,
- and W++ adds one to the count after the  following statements are executed.
if(InWatchList(W))
Places a condition on the statement to be executed i.e. the expression must be true for the following to be applied.
- <
- CategoryAddSymbol(,categoryIndustry,W);
The statement to be executed, (more than one statement could be included within the curly brackets to form a compound statement).
AddColumn(Close,Close,1.2);
Serves no purpose in assigning symbols to categories. It is required to allow the exploration to process the code.
More often than not, in example code, i is used as the variable in a counter. It has been borrowed from mathematics where, by convention, i represents the index of summation in Summation Notation.
The counter looping, used in X_categoryAddSymbols, is analogous to walking through an index of the variables, when the execution of the code is considered as a sequence of steps.
Note: for this example the upper limit for the count was set to 63, which is the maximum number of Watch lists available in AB v4.9.
(INDEX OF) THE VALUE OF THE VARIABLE
VARIABLE < LIMIT