Database (DBMS)
Concepts:
Questions : 1 What is database or database management
systems (DBMS)? and - What’s the difference between file and database? Can
files qualify as a database?
Answers : 1
Database provides a systematic and organized way of storing,
managing and retrieving from collection of logically related information.
Secondly the information has to be persistent, that means
even after the application is closed the information should be persisted.
Finally it should provide an independent way of accessing
data and should not be dependent on the application to access the information.
Main difference between a simple file and database that
database has independent way (SQL) of accessing information while simple files
do not File meets the storing, managing and retrieving part of a database but
not the independent way of accessing data. Many experienced programmers think
that the main difference is that file can not provide multi-user capabilities
which a DBMS provides. But if we look at some old COBOL and C programs where
file where the only means of storing data, we can see functionalities like
locking, multi-user etc provided very efficiently. So it’s a matter of debate
if some interviewers think this as a main difference between files and database
accept it… going in to debate is probably loosing a job.
Questions : 2 What is SQL ?
Answers : 2
SQL stands for Structured Query Language.SQL is an ANSI
(American National Standards Institute) standard computer language for
accessing and manipulating database systems. SQL statements are used to
retrieve and update data in a database.
Questions : 3 What’s difference between DBMS and RDBMS ?
Answers : 3
DBMS provides a systematic and organized way of storing,
managing and retrieving from collection of logically related information. RDBMS
also provides what DBMS provides but above that it provides relationship
integrity. So in short we can say
RDBMS = DBMS + REFERENTIAL INTEGRITY
These relations are defined by using “Foreign Keys” in any
RDBMS.Many DBMS companies claimed there DBMS product was a RDBMS compliant, but
according to industry rules and regulations if the DBMS fulfills the twelve
CODD rules it’s truly a RDBMS. Almost all DBMS (SQL SERVER, ORACLE etc)
fulfills all the twelve CODD rules and are considered as truly RDBMS.
Questions : 4 What are CODD rules?
Answers : 4
In 1969 Dr. E. F. Codd laid down some 12 rules which a DBMS
should adhere in order to get the logo of a true RDBMS.
Rule 1: Information
Rule.
"All information in a relational data base is
represented explicitly at the logical level and in exactly one way - by values
in tables."
Rule 2: Guaranteed
access Rule.
"Each and every datum (atomic value) in a relational
data base is guaranteed to be logically accessible by resorting to a
combination of table name, primary key value and column name."
In flat files we have to parse and know exact location of
field values. But if a DBMS is truly RDBMS you can access the value by
specifying the table name, field name, for instance Customers.Fields [‘Customer
Name’].
Rule 3: Systematic
treatment of null values.
"Null values (distinct from the empty character string
or a string of blank characters and distinct from zero or any other number) are
supported in fully relational DBMS for representing missing information and
inapplicable information in a systematic way, independent of data type.".
Rule 4: Dynamic on-line
catalog based on the relational model.
"The data base description is represented at the logical
level in the same way as ordinary data, so that authorized users can apply the
same relational language to its interrogation as they apply to the regular
data."The Data Dictionary is held within the RDBMS, thus there is no-need
for off-line volumes to tell you the structure of the database.
Rule 5: Comprehensive
data sub-language Rule.
"A relational system may support several languages and
various modes of terminal use (for example, the fill-in-the-blanks mode).
However, there must be at least one language whose statements are expressible,
per some well-defined syntax, as character strings and that is comprehensive in
supporting all the following items
Data Definition
View Definition
Data Manipulation (Interactive and by program).
Integrity Constraints
Authorization.
Transaction boundaries ( Begin , commit and rollback)
Rule 6: .View updating Rule
"All views that are theoretically updatable are also
updatable by the system."
Rule 7: High-level
insert, update and delete.
"The capability of handling a base relation or a derived
relation as a single operand applies not only to the retrieval of data but also
to the insertion, update and deletion of data."
Rule 8: Physical data independence.
"Application programs and terminal activities remain
logically unimpaired whenever any changes are made in either storage
representations or access methods."
Rule 9: Logical data
independence.
"Application programs and terminal activities remain
logically unimpaired when information-preserving changes of any kind that
theoretically permit un-impairment are made to the base tables."
Rule 10: Integrity
independence.
"Integrity constraints specific to a particular
relational data base must be definable in the relational data sub-language and
storable in the catalog, not in the application programs." Rule 11:
Distribution independence.
"A relational DBMS has distribution independence."
Rule 12: Non-subversion
Rule.
"If a relational system has a low-level
(single-record-at-a-time) language, that low level cannot be used to subvert or
bypass the integrity Rules and constraints expressed in the higher level
relational language (multiple-records-at-a-time)."
Questions : 5 What are E-R diagrams?
Answers : 5
E-R diagram also termed as Entity-Relationship diagram shows
relationship between various tables in the database. .
Questions : 6 How many types of relationship exist in
database designing?
Answers : 6
There are three major relationship models:-
One-to-one
One-to-many
Many-to-many
Questions : 7 7.What is normalization? What are
different type of normalization?
Answers : 7
There is set of rules that has been established to aid in the
design of tables that are meant to be connected through relationships. This set
of rules is known as Normalization.
Benefits of Normalizing your database include:
=>Avoiding repetitive entries
=>Reducing required storage space
=>Preventing the need to restructure existing tables to
accommodate new data.
=>Increased speed and flexibility of queries, sorts, and
summaries.
Following are the three normal forms :-
First Normal Form
For a table to be in first normal form, data must be broken
up into the smallest un possible.In addition to breaking data up into the
smallest meaningful values, tables first normal form should not contain
repetitions groups of fields.
Second Normal form
The second normal form states that each field in a multiple
field primary keytable must be directly related to the entire primary key. Or
in other words,each non-key field should be a fact about all the fields in the
primary key.
Third normal form
A non-key field should not depend on other Non-key field.
Questions : 8 What is denormalization ?
Answers : 8
Denormalization is the process of putting one fact in
numerous places (its vice-versa of normalization).Only one valid reason exists
for denormalizing a relational design - to enhance performance.The sacrifice to
performance is that you increase redundancy in database.
Questions : 9 Can you explain Fourth Normal Form and
Fifth Normal Form ?
Answers : 9
In fourth normal form it should not contain two or more
independent multi-v about an entity and it should satisfy “Third Normal form”.
Fifth normal form deals with reconstructing information from
smaller pieces of information. These smaller pieces of information can be
maintained with less redundancy.
Questions : 10 Have you heard about sixth normal form?
Answers : 10
If we want relational system in conjunction with time we use
sixth normal form. At this moment SQL Server does not supports it directly.
Questions : 11 What are DML and DDL statements?
Answers : 11
DML stands for Data Manipulation Statements. They update data
values in table. Below are the most important DDL statements:-
=>SELECT - gets data from a database table
=> UPDATE - updates data in a table
=> DELETE - deletes data from a database table
=> INSERT INTO - inserts new data into a database table
DDL stands for Data definition Language. They change
structure of the database objects like table, index etc. Most important DDL
statements are as shown below:-
=>CREATE TABLE - creates a new table in the database.
=>ALTER TABLE – changes table structure in database.
=>DROP TABLE - deletes a table from database
=> CREATE INDEX - creates an index
=> DROP INDEX - deletes an index
Questions : 12 How do we select distinct values from a
table?
Answers : 12
DISTINCT keyword is used to return only distinct values.
Below is syntax:- Column age and Table pcdsEmp
SELECT DISTINCT age FROM pcdsEmp
Questions : 13 What is Like operator for and what are wild
cards?
Answers : 13
LIKE operator is used to match patterns. A "%" sign
is used to define the pattern.
Below SQL statement will return all words with letter
"S"
SELECT * FROM pcdsEmployee WHERE EmpName LIKE 'S%'
Below SQL statement will return all words which end with
letter "S"
SELECT * FROM pcdsEmployee WHERE EmpName LIKE '%S'
Below SQL statement will return all words having letter
"S" in between
SELECT * FROM pcdsEmployee WHERE EmpName LIKE '%S%'
"_" operator (we can read as “Underscore
Operator”). “_” operator is the character defined at that point. In the below
sample fired a query Select name from pcdsEmployee where name like '_s%' So all
name where second letter is “s” is returned.
Questions : 14 Can you explain Insert, Update and Delete
query?
Answers : 14
Insert statement is used to insert new rows in to table.
Update to update existing data in the table. Delete statement to delete a
record from the table. Below code snippet for Insert, Update and Delete :-
INSERT INTO pcdsEmployee SET name='rohit',age='24';
UPDATE pcdsEmployee SET age='25' where name='rohit';
DELETE FROM pcdsEmployee WHERE name = 'sonia';
Questions : 15 What is order by clause?
Answers : 15
ORDER BY clause helps to sort the data in either ascending
order to descending order.
Ascending order sort query
SELECT name,age FROM pcdsEmployee ORDER BY age ASC
Descending order sort query
SELECT name FROM pcdsEmployee ORDER BY age DESC
Questions : 16 What is the SQL " IN " clause?
Answers : 16
SQL IN operator is used to see if the value exists in a group
of values. For instance the below SQL checks if the Name is either 'rohit' or
'Anuradha' SELECT * FROM pcdsEmployee WHERE name IN ('Rohit','Anuradha') Also
you can specify a not clause with the same. SELECT * FROM pcdsEmployee WHERE
age NOT IN (17,16)
Questions : 17 Can you explain the between clause?
Answers : 17 Below SQL selects employees born between '01/01/1975' AND
'01/01/1978' as per mysql
SELECT * FROM pcdsEmployee WHERE DOB BETWEEN '1975-01-01' AND
'2011-09-28'
Questions : 18 we have an employee salary table how do we
find the second highest from it?
Answers : 18
below Sql Query find the second highest salary
SELECT * FROM pcdsEmployeeSalary a WHERE (2=(SELECT
COUNT(DISTINCT(b.salary)) FROM pcdsEmployeeSalary b WHERE
b.salary>=a.salary))
Questions : 19 What are different types of joins in SQL?
Answers : 19
INNER JOIN
Inner join shows matches only when they exist in both tables.
Example in the below SQL there are two tables Customers and Orders and the
inner join in made on Customers.Customerid and Orders.Customerid. So this SQL
will only give you result with customers who have orders. If the customer does
not have order it will not display that record.
SELECT Customers.*, Orders.* FROM Customers INNER JOIN Orders
ON Customers.CustomerID =Orders.CustomerID
LEFT OUTER JOIN
Left join will display all records in left table of the SQL
statement. In SQL below customers with or without orders will be displayed.
Order data for customers without orders appears as NULL values. For example,
you want to determine the amount ordered by each customer and you need to see
who has not ordered anything as well. You can also see the LEFT OUTER JOIN as a
mirror image of the RIGHT OUTER JOIN (Is covered in the next section) if you
switch the side of each table.
SELECT Customers.*, Orders.* FROM Customers LEFT OUTER JOIN
Orders ON Customers.CustomerID =Orders.CustomerID
RIGHT OUTER JOIN
Right join will display all records in right table of the SQL
statement. In SQL below all orders with or without matching customer records
will be displayed. Customer data for orders without customers appears as NULL
values. For example, you want to determine if there are any orders in the data
with undefined CustomerID values (say, after a conversion or something like
it). You can also see the RIGHT OUTER JOIN as a mirror image of the LEFT OUTER
JOIN if you switch the side of each table.
SELECT Customers.*, Orders.* FROM Customers RIGHT OUTER JOIN
Orders ON Customers.CustomerID =Orders.CustomerID
Questions : 20 What is “CROSS JOIN”? or What is Cartesian
product?
Answers : 20
“CROSS JOIN” or “CARTESIAN PRODUCT” combines all rows from
both tables. Number of rows will be product of the number of rows in each
table. In real life scenario I can not imagine where we will want to use a
Cartesian product. But there are scenarios where we would like permutation and
combination probably Cartesian would be the easiest way to achieve it.
Questions : 21 How to select the first record in a given
set of rows?
Answers : 21
Select top 1 * from sales.salesperson
Questions : 22 What is the default “-SORT ” order for a
SQL?
Answers : 22
ASCENDING
Questions : 23 What is a self-join?
Answers : 23
If we want to join two instances of the same table we can use
self-join.
Questions : 24 What’s the difference between DELETE and
TRUNCATE ?
Answers : 24
Following are difference between them:
=>>DELETE TABLE syntax logs the deletes thus making the
delete operations low. TRUNCATE table does not log any information but it logs
information about deallocation of data page of the table. So TRUNCATE table is
faster as compared to delete table.
=>>DELETE table can have criteria while TRUNCATE can
not.
=>> TRUNCATE table can not have triggers.
Questions : 25 What’s the difference between “UNION” and
“UNION ALL” ?
Answers : 25
UNION SQL syntax is used to select information from two
tables. But it selects only distinct records from both the table. , while UNION
ALL selects all records from both the tables.
Questions : 26 What are cursors and what are the situations
you will use them?
Answers : 26
SQL statements are good for set at a time operation. So it is
good at handling set of data. But there are scenarios where we want to update
row depending on certain criteria. we will loop through all rows and update
data accordingly. There’s where cursors come in to picture.
Questions : 27 What is " Group by " clause?
Answers : 27
“Group by” clause group similar data so that aggregate values
can be derived.
Questions : 28 What is the difference between “HAVING” and
“WHERE” clause?
Answers : 28
“HAVING” clause is used to specify filtering criteria for
“GROUP BY”, while “WHERE” clause applies on normal SQL.
Questions : 29 What is a Sub-Query?
Answers : 29
A query nested inside a SELECT statement is known as a
subquery and is an alternative to complex join statements. A subquery combines
data from multiple tables and returns results that are inserted into the WHERE
condition of the main query. A subquery is always enclosed within parentheses
and returns a column. A subquery can also be referred to as an inner query and
the main query as an outer query. JOIN gives better performance than a subquery
when you have to check for the existence of records.
For example, to retrieve all EmployeeID and CustomerID
records from the ORDERS table that have the EmployeeID greater than the average
of the EmployeeID field, you can create a nested query, as shown:
SELECT DISTINCT EmployeeID, CustomerID FROM ORDERS WHERE
EmployeeID > (SELECT AVG(EmployeeID) FROM ORDERS)
Questions : 30 What are Aggregate and Scalar Functions?
Answers : 30
Aggregate and Scalar functions are in built function for
counting and calculations.
Aggregate functions operate against a group of values but
returns only one value.
AVG(column) :- Returns the average value of a column
COUNT(column) :- Returns the number of rows (without a NULL
value) of a column
COUNT(*) :- Returns the number of selected rows
MAX(column) :- Returns the highest value of a column
MIN(column) :- Returns the lowest value of a column
Scalar functions operate against a single value and return
value on basis of the single value.
UCASE(c) :- Converts a field to upper case
LCASE(c) :- Converts a field to lower case
MID(c,start[,end]) :- Extract characters from a text field
LEN(c) :- Returns the length of a text
Questions : 31 Can you explain the SELECT INTO Statement?
Answers : 31
SELECT INTO statement is used mostly to create backups. The
below SQL backsup the Employee table in to the EmployeeBackUp table. One point
to be noted is that the structure of pcdsEmployeeBackup and pcdsEmployee table
should be same. SELECT * INTO pcdsEmployeeBackup FROM pcdsEmployee
Questions : 32 What is a View?
Answers : 32
View is a virtual table which is created on the basis of the
result set returned by the select statement.
CREATE VIEW [MyView] AS SELECT * from pcdsEmployee where
LastName = 'singh'
In order to query the view
SELECT * FROM [MyView]
Questions : 33 What is SQl injection ?
Answers : 33
It is a Form of attack on a database-driven Web site in which
the attacker executes unauthorized SQL commands by taking advantage of insecure
code on a system connected to the Internet, bypassing the firewall. SQL
injection attacks are used to steal information from a database from which the
data would normally not be available and/or to gain access to an organization’s
host computers through the computer that is hosting the database.
SQL injection attacks typically are easy to avoid by ensuring
that a system has strong input validation.
As name suggest we inject SQL which can be relatively
dangerous for the database. Example this is a simple SQL
SELECT email, passwd, login_id, full_name
FROM members WHERE email = 'x'
Now somebody does not put “x” as the input but puts “x ; DROP
TABLE members;”.
So the actual SQL which will execute is :-
SELECT email, passwd, login_id, full_name FROM members WHERE
email = 'x' ; DROP TABLE members;
Think what will happen to your database.
Questions : 34 What is Data Warehousing ?
Answers : 34
Data Warehousing is a process in which the data is stored and
accessed from central location and is meant to support some strategic
decisions. Data Warehousing is not a requirement for Data mining. But just
makes your Data mining process more efficient.
Data warehouse is a collection of integrated,
subject-oriented databases designed to support the decision-support functions
(DSF), where each unit of data is relevant to some moment in time.
Questions : 35 What are Data Marts?
Answers : 35
Data Marts are smaller section of Data Warehouses. They help
data warehouses collect data. For example your company has lot of branches
which are spanned across the globe. Head-office of the company decides to
collect data from all these branches for anticipating market. So to achieve
this IT department can setup data mart in all branch offices and a central data
warehouse where all data will finally reside.
Questions : 36 What are Fact tables and Dimension Tables ?
What is Dimensional Modeling and Star Schema Design
Answers : 36
When we design transactional database we always think in
terms of normalizing design to its least form. But when it comes to designing
for Data warehouse we think more in terms of denormalizing the database. Data
warehousing databases are designed using Dimensional Modeling. Dimensional
Modeling uses the existing relational database structure and builds on that.
There are two basic tables in dimensional modeling:-
Fact Tables.
Dimension Tables.
Fact tables are central tables in data warehousing. Fact
tables have the actual aggregate values which will be needed in a business
process. While dimension tables revolve around fact tables. They describe the
attributes of the fact tables.
Questions : 37 What is Snow Flake Schema design in
database? What’s the difference between Star and Snow flake schema?
Answers : 37
Star schema is good when you do not have big tables in data
warehousing. But when tables start becoming really huge it is better to
denormalize. When you denormalize star schema it is nothing but snow flake design.
For instance below customeraddress table is been normalized and is a child
table of Customer table. Same holds true for Salesperson table.
Questions : 38 What is ETL process in Data warehousing?
What are the different stages in “Data warehousing”?
Answers : 38
ETL (Extraction, Transformation and Loading) are different
stages in Data warehousing. Like when we do software development we follow
different stages like requirement gathering, designing, coding and testing. In
the similar fashion we have for data warehousing.
Extraction:-
In this process we extract data from the source. In actual
scenarios data source can be in many forms EXCEL, ACCESS, Delimited text, CSV
(Comma Separated Files) etc. So extraction process handle’s the complexity of
understanding the data source and loading it in a structure of data warehouse.
Transformation:-
This process can also be called as cleaning up process. It’s
not necessary that after the extraction process data is clean and valid. For
instance all the financial figures have NULL values but you want it to be ZERO
for better analysis. So you can have some kind of stored procedure which runs
through all extracted records and sets the value to zero.
Loading:-
After transformation you are ready to load the information in
to your final data warehouse database.
Questions : 39 What is Data mining ?
Answers : 39
Data mining is a concept by which we can analyze the current
data from different perspectives and summarize the information in more useful
manner. It’s mostly used either to derive some valuable information from the
existing data or to predict sales to increase customer market.
There are two basic aims of Data mining:-
Prediction: -
From the given data we can focus on how the customer or
market will perform. For instance we are having a sale of 40000 $ per month in
India, if the same product is to be sold with a discount how much sales can the
company expect.
Summarization: -
To derive important information to analyze the current
business scenario. For example a weekly sales report will give a picture to the
top management how we are performing on a weekly basis?
Questions : 40 Compare Data mining and Data Warehousing ?
Answers : 40
“Data Warehousing” is technical process where we are making
our data centralized while “Data mining” is more of business activity which
will analyze how good your business is doing or predict how it will do in the
future coming times using the current data. As said before “Data Warehousing”
is not a need for “Data mining”. It’s good if you are doing “Data mining” on a
“Data Warehouse” rather than on an actual production database. “Data
Warehousing” is essential when we want to consolidate data from different
sources, so it’s like a cleaner and matured data which sits in between the
various data sources and brings then in to one format. “Data Warehouses” are
normally physical entities which are meant to improve accuracy of “Data mining”
process. For example you have 10 companies sending data in different format, so
you create one physical database for consolidating all the data from different
company sources, while “Data mining” can be a physical model or logical model.
You can create a database in “Data mining” which gives you reports of net sales
for this year for all companies. This need not be a physical database as such
but a simple query.
Questions : 41 What are indexes? What are B-Trees?
Answers : 41
Index makes your search faster. So defining indexes to your
database will make your search faster.Most of the indexing fundamentals use
“B-Tree” or “Balanced-Tree” principle. It’s not a principle that is something
is created by SQL Server or ORACLE but is a mathematical derived fundamental.In
order that “B-tree” fundamental work properly both of the sides should be
balanced.
Questions : 42 I have a table which has lot of inserts, is
it a good database design to create indexes on that table?
Insert’s are slower on
tables which have indexes, justify it?or Why do page splitting happen?
Answers : 42
All indexing fundamentals in database use “B-tree”
fundamental. Now whenever there is new data inserted or deleted the tree tries
to become unbalance.
Creates a new page to balance the tree.
Shuffle and move the data to pages.
So if your table is having heavy inserts that means it’s
transactional, then you can visualize the amount of splits it will be doing.
This will not only increase insert time but will also upset the end-user who is
sitting on the screen. So when you forecast that a table has lot of inserts
it’s not a good idea to create indexes.
Questions : 43 What are the two types of indexes and
explain them in detail? or What’s the difference between clustered and
non-clustered indexes?
Answers : 43
There are basically two types of indexes:-
Clustered Indexes.
Non-Clustered Indexes.
In clustered index the non-leaf level actually points to the
actual data.In Non-Clustered index the leaf nodes point to pointers (they are
rowid’s) which then point to actual data.
No comments:
Post a Comment