1) How many error tables are there in FastLoad (fload), and what are they used for? Can we see the data in the error tables? How many error tables are there in MultiLoad (mload), and what are they used for? When an mload job fails, can we access the mload tables? If yes, how?
FastLoad uses 2 error tables:
Error table 1: rows rejected because the format of the data is not correct.
Error table 2: rows that violate the unique primary index (UPI).
The error tables are ordinary tables, so their contents can be viewed with a SELECT.
MultiLoad also uses 2 error tables (ET and UV), plus 1 work table and 1 log table:
1. ET TABLE - data errors. MultiLoad uses the ET table, also called the Acquisition Phase error table, to store data errors found during the acquisition phase of a MultiLoad import task.
2. UV TABLE - UPI violations. MultiLoad uses the UV table, also called the Application Phase error table, to store data errors found during the application phase of a MultiLoad import or delete task.
3. WORK TABLE (WT) - MultiLoad loads the selected records into the work table.
4. LOG TABLE - The log table keeps a record of all checkpoints for the load job. Specifying a log table is mandatory in an mload job, and it is what lets the job restart after an abort.
2) What are SET tables and MULTISET tables in Teradata? Explain with an appropriate example.
A SET table does not allow duplicate rows. A MULTISET table allows duplicate rows.
3) Teradata optimization and performance tuning
Optimization is the technique of selecting the least expensive (fastest) plan for a query to fetch its results. How well a plan performs depends on the availability of:
1. CPU resources
2. System resources such as AMPs and PEs
Performance tuning is the practice of improving a process so that a query runs faster with minimal use of CPU resources.
4) How many types of errors can occur in the spool process? How will you connect one database server to another server?
In UNIX we can connect from one server to another using ssh or ftp (su only switches the user on the current machine).
5) Join strategies
1. There are 2 tables: table A with 10 million records and table B with 100 million records. We join the two tables, and the EXPLAIN plan shows that Teradata takes table A and redistributes it. Is the optimizer doing the right thing with that plan? Justify your answer.
2. In the same example, the optimizer now takes table B (100 million records) and redistributes it. Is the optimizer doing its best, and how do you avoid this situation?
Teradata is smart enough to decide when to redistribute and when to copy. It compares the tables: are they comparable in size, or is one big compared to the other? Based on that, it decides whether to redistribute the smaller table across the AMPs or to copy it, meaning the small table is duplicated onto every AMP in spool space. Remember that joins always take place on the AMPs, in spool space. By moving the smaller table, Teradata lets the 100-million-row table stay where it is and still take part in an AMP-local join. Whatever Teradata does, it weighs space, performance and efficiency.
My simple rule: if a table is small, move it to the AMPs so that the join is AMP-local. Joins are always made AMP-local, and when a large table has to be moved to achieve that, there is a high chance of running out of spool space. Collecting statistics on the join columns helps the optimizer see the real table sizes and pick the smaller table to move.
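The redistribute-or-copy decision described above can be sketched with a toy cost model. This is only an illustration, assuming cost is simply the number of rows written to spool; the function names and the `larger_on_join_column` flag are hypothetical, and the real optimizer uses statistics and many more factors.

```python
def spool_rows(rows_a, rows_b, num_amps, larger_on_join_column=False):
    """Rows written to spool by each way of making the join AMP-local (toy model)."""
    small = min(rows_a, rows_b)
    # Copying puts a full copy of the smaller table on every AMP.
    costs = {"duplicate smaller table on every AMP": small * num_amps}
    if larger_on_join_column:
        # Assumption: the larger table is already hashed on the join column,
        # so it can stay put and only the smaller table is rehashed.
        costs["redistribute smaller table only"] = small
    else:
        # Neither table is hashed on the join column: both are rehashed.
        costs["redistribute both tables"] = rows_a + rows_b
    return costs


def choose_plan(rows_a, rows_b, num_amps, larger_on_join_column=False):
    """Pick the option that writes the fewest rows to spool."""
    costs = spool_rows(rows_a, rows_b, num_amps, larger_on_join_column)
    return min(costs, key=costs.get)
```

With the 10 million and 100 million row tables on an assumed 100-AMP system, `choose_plan(10_000_000, 100_000_000, 100, larger_on_join_column=True)` picks redistributing the smaller table, while a 1,000-row table joined to the same large table comes out cheaper to duplicate.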
6) What is EXPLAIN in Teradata?
The EXPLAIN facility is a Teradata extension that provides an "English" translation of the steps chosen by the optimizer to execute an SQL statement. It may be used on any valid Teradata SQL statement by prefacing it with the keyword EXPLAIN. The following is an example:
EXPLAIN SELECT last_name, first_name FROM employees;
EXPLAIN parses the SQL statement but does not execute it. This gives the designer an "execution strategy". The execution strategy shows what the optimizer does, but not why it chooses those steps. The EXPLAIN facility is used to analyze all joins and complex queries.
7) What is the difference between Global temporary tables and Volatile temporary tables?
Global Temporary tables (GTT)
1. When a GTT is created, its definition goes into the Data Dictionary.
2. When it is materialized, its data goes into temp space.
3. That is why the data is active only until the session ends, while the definition remains until it is dropped with a DROP TABLE statement. If it is dropped from some other session, the statement should be DROP TABLE ... ALL;
4. You can collect statistics on a GTT.
Volatile Temporary tables (VTT)
1. The table definition is stored in the system cache.
2. The data is stored in spool space.
3. That is why both the data and the table definition are active only until the session ends.
4. Statistics cannot be collected on a VTT.
8) How does Teradata make sure that no duplicate rows are inserted into a SET table?
Teradata sends each newly inserted row to its target AMP according to its primary index (on the basis of the row hash value). If it finds the same row hash value on that AMP (hash synonyms), it compares the whole row to find out whether it is a duplicate. In an INSERT ... SELECT, a duplicate row is silently skipped without raising an error; a single-row INSERT of a duplicate is rejected with a duplicate row error.
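The row-hash-then-full-row check described above can be sketched as a toy model. `ToySetTable` is a hypothetical class and `zlib.crc32` only stands in for Teradata's real hashing algorithm, so treat this as an illustration of the logic rather than of the actual implementation.

```python
import zlib


class ToySetTable:
    """Toy SET table: rows are dicts, spread over AMPs by the hash of the PI column."""

    def __init__(self, num_amps, pi_column):
        self.num_amps = num_amps
        self.pi_column = pi_column
        # One dict per AMP, mapping row hash -> rows stored under that hash.
        self.amps = [dict() for _ in range(num_amps)]

    def _row_hash(self, row):
        # Stand-in hash (assumption); Teradata uses its own hashing algorithm.
        return zlib.crc32(repr(row[self.pi_column]).encode())

    def insert(self, row):
        """Return True if the row was stored, False if it was a duplicate."""
        row_hash = self._row_hash(row)
        same_hash_rows = self.amps[row_hash % self.num_amps].setdefault(row_hash, [])
        # Only rows with the same row hash need the full-row comparison.
        if row in same_hash_rows:
            return False
        same_hash_rows.append(dict(row))
        return True

    def row_count(self):
        return sum(len(rows) for amp in self.amps for rows in amp.values())
```

Two rows with the same primary index value but different column values share a row hash and are both kept; only a row that matches on every column is rejected.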
9) After creating tables dynamically in Teradata, where is the GRANT table option usually done? When tables are newly created, what is the default role and what are the default privileges that get assigned?
The GRANT option for any particular table depends on the privileges of the user. An admin user can grant privileges at any point in time. The default roles associated with newly created tables depend on the schema in which they are created.
10) What are cliques? What is a vdisk, and how does it communicate with physical data storage at the time of data retrieval through an AMP?
A clique is a set of Teradata nodes that share a common set of disk arrays. Cabling a subset of nodes to the same disk arrays creates a clique. Each AMP vproc must have access to an array controller, which in turn accesses the physical disks. AMP vprocs are associated with one or more ranks (or mirrored pairs) of data. The total disk space associated with an AMP is called a vdisk. A vdisk may have up to three ranks. Hence the vdisk communicates with physical storage through the array controllers.
The vdisk provides protection against disk failure.
The node provides protection against AMP failure.
The clique provides protection against node failure.
All the Disks(Dsik
11) How do indexes optimize query performance?
Indexing is a way to physically reorganise the records so that some frequently used queries run faster. The index can be used as a pointer into the large table: it helps to locate the required row quickly and return it to the user. Alternatively, frequently used queries need not hit the large table at all, because they can get what they want from the index itself (covered queries).
An index comes with the overhead of maintenance. Teradata maintains its indexes by itself: each time an insert, update or delete is done on the table, the indexes also have to be updated. Indexes cannot be accessed directly by users. Only the optimizer has access to the index.
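The idea of an index as a pointer into a large table, and of its maintenance cost, can be shown with a small in-memory sketch. The function names are hypothetical and the structure is a plain dictionary, not Teradata's index subtable format.

```python
from collections import defaultdict


def build_index(rows, column):
    """Map each value of `column` to the positions of the rows that hold it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return index


def lookup(rows, index, value):
    """Fetch matching rows through the index instead of scanning every row."""
    return [rows[position] for position in index.get(value, [])]


def insert_row(rows, index, column, row):
    """Every insert also has to update the index: the maintenance overhead."""
    rows.append(row)
    index[row[column]].append(len(rows) - 1)
```

A lookup touches only the positions listed under one key, whereas a query without the index would compare the value against every row in the table.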
12) What is a common data source for the central enterprise data warehouse?
Operational data stores.
13) What is the difference between MultiLoad and FastLoad in terms of performance?
Answer-1: If you want to load an empty table, use FastLoad; it is more useful than MultiLoad there because it loads the data in 2 phases and needs no work table, so it is faster. It follows the steps below to load the data into the table:
Phase 1 - The records are sent in blocks to the AMPs, and each AMP hashes the rows and passes them to the AMP that owns them. The rows are not yet sorted.
Phase 2 - After the END LOADING command is given, each AMP sorts its rows and writes them to the table.
MultiLoad does the loading in 5 phases:
Phase 1 (preliminary): it gets the import file and checks the script.
Phase 2 (DML transaction): the DML statements are sent to the database.
Phase 3 (acquisition): it reads the input records and stores them in the work table.
Phase 4 (application): the table header is locked and the DML operations are applied to the target tables.
Phase 5 (cleanup): the table locks are released and the work tables are dropped.
14) Teradata performance tuning and optimization
Collect statistics.
Use EXPLAIN statements.
Avoid product joins when possible.
Select an appropriate primary index to avoid skew in storage.
Avoid redistributions when possible.
Use sub-selects instead of big "IN" lists.
Use derived tables.
Use GROUP BY instead of DISTINCT (GROUP BY sorts the data locally on the vproc; DISTINCT sorts the data after it is redistributed).
Use compression on large tables.
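The advice about choosing a primary index that avoids skew can be illustrated with a rough sketch that hashes candidate column values onto AMPs and measures how uneven the result is. `zlib.crc32` stands in for Teradata's hash function, and the skew formula used here (100 * (1 - average / maximum)) is one common definition, so both are assumptions.

```python
import zlib
from collections import Counter


def rows_per_amp(pi_values, num_amps):
    """Count how many rows land on each AMP for a candidate primary index column."""
    counts = Counter(zlib.crc32(repr(v).encode()) % num_amps for v in pi_values)
    return [counts.get(amp, 0) for amp in range(num_amps)]


def skew_percent(counts):
    """0 means perfectly even; values near 100 mean one AMP holds almost everything."""
    average = sum(counts) / len(counts)
    return 100.0 * (1 - average / max(counts))
```

A column with a single repeated value, such as a flag, sends every row to one AMP, while a unique column such as an order number spreads the rows across the AMPs and gives a much lower skew figure.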
15) Why does MLOAD need work tables?
Work tables are used to receive and sort data and SQL on each AMP prior to storing them permanently to disk. The purpose of the work tables is to hold two things:
1. The Data Manipulation Language (DML) tasks
2. The input data that is ready to APPLY to the AMPs
MultiLoad automatically creates one work table for each target table. This means that in IMPORT mode you could have one or more work tables. In DELETE mode you will have only one work table, since that mode works on only one target table.
16) Write a single SQL statement to delete duplicate records from a single table based on a column value. I need only unique records at the end of the query.
A nested query might be required in other databases; in Teradata we do not need such a difficult approach just to find the unique rows. Teradata has functions such as RANK() and ROW_NUMBER() which, in combination with QUALIFY, help you select the rows you want to delete. You can add a condition that qualifies the rows whose rank is greater than 1.
17) Why doesn't FastLoad support duplicate rows in a MULTISET table?
Restart logic is the reason that FastLoad will not load duplicate rows into a MULTISET table. When a FastLoad job is restarted, it resumes from the record after the value stored in the checkpoint, so the first row after the checkpoint and some of the rows that follow it are sent to the AMPs a second time. These are caught as duplicate rows after the sort. FastLoad cannot tell a re-sent row from a genuine duplicate, so it assumes every duplicate comes from a restart and discards it.
18) Why does the MultiLoad utility support only Non-Unique Secondary Indexes (NUSI) on the target table?
Like FastLoad, MultiLoad does not support Unique Secondary Indexes (USIs). But unlike FastLoad, it does support Non-Unique Secondary Indexes (NUSIs), because the index subtable row is on the same AMP as the data row. MultiLoad uses every AMP independently and in parallel. If two AMPs must communicate, they are not independent. Therefore a NUSI (same AMP) is fine, but a USI (different AMP) is not.
19) Where can we find information about all the indexes?
In the system view "dbc.indices".
20) Can we load a MULTISET table using MLOAD?
We can load both SET and MULTISET tables using MLOAD. When loading into a MULTISET table with MLOAD, duplicate rows are not rejected, so we have to take care of them before loading. With FLOAD, duplicate rows are rejected automatically even when loading into a MULTISET table: FLOAD will not load duplicate rows whether the table is SET or MULTISET.
21) Types of tables in Teradata:
1. Derived
2. Volatile
3. Global Temporary
4. Permanent
4.1. SET
4.2. MULTISET
22) What is the FILLER command in Teradata?
While using mload or fastload, if you do not want to load a particular field from the data file into the target table, use the FILLER command to skip it.
23) Restarting MultiLoad
If the data or the structure of the table has changed, drop the work tables, error tables and log table, and release mload from the table into which it is supposed to insert values. If mload fails due to any other error, simply restart it after fixing that error and it will resume from the checkpoint.
24) Can I use a "drop" statement in the utility "fload"?
Yes, but you have to declare it outside the FLOAD block, meaning it must not come between .BEGIN LOADING and .END LOADING. FLOAD also supports DELETE, CREATE and DROP statements, all of which must be declared outside the FLOAD block. Inside the FLOAD block only INSERT is allowed.
25) FastLoad: 4 tables
Any loader utility has 2 tables: the log table and the target table. For mload and fload we also have the error table (ET) and the UV (uniqueness violation) table. Errors related to unique values are loaded into the UV table; data conversion errors are loaded into the ET table.
In MLOAD there are 5 tables: the additional one is the work table, which holds the data loaded from the source.
MultiLoad has 5 phases:
1) Preliminary phase: all mload commands and SQL syntax are checked, sessions are defined and support tables are created.
2) DML transaction phase: the SQL is sent to the PE and the SQL plans are built.
3) Acquisition phase: the data is captured and each data row is assigned to an AMP based on hashing.
4) Application phase: the data is sorted and applied.
5) Cleanup phase: the logs are cleaned up and the sessions are closed.
What is optimization and performance tuning, and how does it really work in practical projects?
Answer-1: Performance tuning and optimization of a query involves collecting statistics on join columns, avoiding cross product joins, selecting an appropriate primary index (to avoid skew in storage) and using secondary indexes. Avoiding NUSIs is advisable.
What are the enhanced features in Teradata V2R5 and V2R6?
V2R6 added a replication feature, in which copies of the database are available on another system. This means V2R6 provides additional data protection compared with V2R5 in case the data on one system is lost.
What is Basic Teradata Query language?
Answer-1: BTEQ (Basic Teradata Query) allows us to write SQL statements along with BTEQ commands. We can use BTEQ for importing, exporting and reporting purposes. BTEQ commands start with a dot (.) and can be terminated with a semicolon (;), although the semicolon is not mandatory. SQL statements do not start with a dot, and the semicolon is compulsory to terminate them. BTEQ assumes that anything written without a leading dot is an SQL statement and requires a semicolon to terminate it.
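The dot-command and semicolon rules above can be sketched as a small line splitter. This is a toy illustration, not how BTEQ itself parses input: `split_bteq` is a hypothetical helper, it ignores comments and semicolons inside string literals, and the logon string in the usage note is a placeholder.

```python
def split_bteq(script):
    """Split a script into ("bteq", command) and ("sql", statement) pairs."""
    statements = []
    sql_parts = []
    for raw_line in script.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if not sql_parts and line.startswith("."):
            # Dot commands are complete on their own line; the ';' is optional.
            statements.append(("bteq", line.rstrip(";").strip()))
            continue
        # Anything else is SQL and keeps accumulating until a ';' ends it.
        sql_parts.append(line)
        if line.endswith(";"):
            statements.append(("sql", " ".join(sql_parts)))
            sql_parts = []
    if sql_parts:
        raise ValueError("SQL statement is missing its terminating ';'")
    return statements
```

Given the lines `.LOGON tdpid/user,password`, a two-line `SELECT last_name, first_name FROM employees;` and `.QUIT;`, the splitter returns two BTEQ commands with one SQL statement between them, and it raises an error if the SELECT has no closing semicolon.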
How many of Codd's rules are satisfied by the Teradata database?
Answer-1: There are 12 Codd's rules applied to the Teradata database.
Does the SDLC change when you use Teradata instead of Oracle?
Answer-1: If Teradata is going to be used only as a database, the system development life cycle (SDLC) will not change.
If you are going to use the Teradata utilities, the architecture or SDLC will change. If your schema is going to be in 3NF, there will not be a huge change.
Which two statements are true about a foreign key?
Each Foreign Key must exist as a Primary Key.
Foreign Keys can change values over time.
Answer-1: First: true. Second: false.
1. Foreign Keys can change values over time.
2. Each Foreign Key must exist as a Primary Key.
What are two examples of an OLTP environment?
# Transactions take a matter of seconds or less.
# Many transactions involve a small amount of data.
Answer-1: Online banking; online reservation (transportation such as rail, air etc.)
Answer-2: 1. ATM 2. POS
Answer-3: OLTP is typified by a small number of rows (or records), or a few of many possible tables, being accessed in a matter of seconds or less. Very little I/O processing is required to complete the transaction. For example:
1. This type of transaction takes place when we take out money at an ATM. Once our card is validated, a debit transaction takes place against our current balance to reflect the amount of cash withdrawn.
2. This type of transaction also takes place when we deposit money into a checking account and the balance gets updated.
We expect these transactions to be performed quickly. They must occur in real time.