## Common Pitfalls in Sample Statistical Analysis and How to Avoid Them

Sample statistical analysis is a crucial step in any research project. It involves examining a subset of data to make inferences about the larger population. However, there are several common pitfalls that researchers often fall into during this process. In this article, we will discuss these pitfalls and provide tips on how to avoid them.

## Inadequate Sample Size

One of the most common pitfalls in sample statistical analysis is using an inadequate sample size. A small sample size can lead to unreliable results and limited generalizability. Researchers may be tempted to save time and resources by using a smaller sample, but this can compromise the validity of their findings.

To avoid this pitfall, it is important to determine an appropriate sample size before conducting the analysis. This can be done through power calculations or consulting statistical experts. By ensuring an adequate sample size, researchers can increase the reliability and accuracy of their results.
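As a rough illustration of a power calculation, the sketch below estimates the sample size per group for a two-sided comparison of two means. It assumes the normal approximation n = 2((z_{1-α/2} + z_{power}) / d)², where d is a standardized effect size. The function name and the default values for `alpha` and `power` are illustrative choices, not something prescribed by the article, and an exact t-based calculation would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_{1-alpha/2} + z_{power}) / d) ** 2,
    where d is the standardized effect size (difference in means / SD).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = standard_normal.inv_cdf(power)          # quantile for the target power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Smaller expected effects need much larger samples.
n_medium_effect = sample_size_per_group(0.5)
n_small_effect = sample_size_per_group(0.2)
print(n_medium_effect, n_small_effect)
```

The point of the sketch is the shape of the relationship: halving the expected effect size roughly quadruples the required sample.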

## Biased Sampling

Another pitfall in sample statistical analysis is biased sampling. Biased sampling occurs when the selection process favors certain individuals or groups over others, leading to skewed results that do not accurately represent the population.

To avoid biased sampling, researchers should strive for random sampling techniques whenever possible. Random sampling ensures that each member of the population has an equal chance of being included in the sample, minimizing selection bias. If random sampling is not feasible, researchers should clearly acknowledge any potential biases and discuss their implications on the interpretation of results.
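A minimal sketch of simple random sampling, assuming the population can be listed as a sampling frame of IDs. The frame of 1,000 IDs, the sample size of 50, and the seed are made-up values for illustration.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every member is equally likely to be chosen."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible for audits
    return rng.sample(list(population), n)


# Hypothetical sampling frame: IDs 1..1000.
population_ids = range(1, 1001)
sample = simple_random_sample(population_ids, 50, seed=42)
print(len(sample), len(set(sample)))
```

Recording the seed alongside the analysis lets someone else reproduce exactly which units were selected.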

## Failure to Consider Nonresponse Bias

Nonresponse bias refers to the potential bias introduced when individuals selected for a study do not respond or participate fully. This can occur when participants refuse to answer certain questions or drop out of the study altogether.

To address nonresponse bias, researchers should make efforts to maximize response rates and minimize missing data. Clear communication with participants about the importance of their responses and follow-up reminders can help improve response rates. Additionally, researchers can use statistical techniques such as imputation to handle missing data and reduce the impact of nonresponse bias on the results.
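The sketch below shows the simplest form of imputation, replacing missing values with the mean of the observed ones, together with an item response rate. The `ages` list and the use of `None` for missing answers are assumptions for illustration. Mean imputation understates variability and only makes sense when values are missing more or less at random, so treat it as a starting point rather than a recommended method.

```python
from statistics import mean


def response_rate(values):
    """Share of items that were actually answered (None marks a missing answer)."""
    return sum(value is not None for value in values) / len(values)


def impute_mean(values):
    """Replace each missing value with the mean of the observed values."""
    observed = [value for value in values if value is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    fill = mean(observed)
    return [fill if value is None else value for value in values]


# Hypothetical survey item with two nonresponses.
ages = [34, None, 29, 41, None, 36]
print(response_rate(ages))
print(impute_mean(ages))
```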

## Ignoring Assumptions of Statistical Tests

Statistical tests often have underlying assumptions that need to be met for the results to be valid. Ignoring these assumptions can lead to erroneous conclusions and misinterpretation of data.

To avoid this pitfall, researchers should carefully examine the assumptions of the statistical tests they plan to use. Common assumptions include normality, independence, and equal variances. If these assumptions are violated, researchers should consider alternative methods or transformations that are more appropriate for their data. It is crucial to acknowledge any deviations from these assumptions in the interpretation of results.
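As one hedged example of checking an assumption and switching methods, the sketch below compares two group variances and, when they differ widely, computes Welch's t statistic, which does not assume equal variances. The two groups are made-up data, and the threshold of 4 for the variance ratio is a rough rule of thumb used here as an assumption, not a formal test.

```python
from math import sqrt
from statistics import mean, variance


def variance_ratio(a, b):
    """Ratio of the larger sample variance to the smaller one."""
    var_a, var_b = variance(a), variance(b)
    if min(var_a, var_b) == 0:
        raise ValueError("a group has zero variance")
    return max(var_a, var_b) / min(var_a, var_b)


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    se_a, se_b = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1))
    return t, df


# Hypothetical measurements: group_b is far more spread out than group_a.
group_a = [5.1, 4.9, 5.0, 5.2, 4.8]
group_b = [3.0, 7.5, 5.5, 2.0, 9.0]

if variance_ratio(group_a, group_b) > 4:  # rule-of-thumb threshold (assumption)
    t_stat, dof = welch_t(group_a, group_b)
    print("variances look unequal; Welch t =", t_stat, "df =", dof)
```

Whichever alternative is chosen, the deviation from the original test's assumptions should still be reported with the results.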

In conclusion, sample statistical analysis is a critical step in research projects, but it is not without its pitfalls. By using adequate sample sizes, avoiding biased sampling, addressing nonresponse bias, and checking the assumptions of statistical tests, researchers can enhance the validity and reliability of their findings. Taking these precautions helps ensure that the sample accurately represents the larger population and provides meaningful insights for decision-making.

## DATA 275 Introduction to Data Analytics

• Getting Started with SPSS
• Variable View
• Option Suggestions
• SPSS Viewer
• Entering Data
• Cleaning & Checking Your SPSS Database
• Recoding Data: Collapsing Continuous Data
• Constructing Scales and Checking Their Reliability
• Formatting Tables in APA style
• Creating a syntax
• Public Data Sources

## Data Analytics Project Assignment

Literature Review

For your research project you will conduct data analysis and write a report summarizing your analysis and its findings. You will accomplish this by completing a series of assignments.

Data 275 Research Project Assignment

In this week’s assignment, you are required to accomplish the following tasks:

1. Propose a topic for your project

The topic you select for your capstone depends on your interest and the data problem you want to address. Try to pick a topic that you would enjoy researching and writing about.

Your topic selection will also be influenced by data availability. Because this is a data analytics project, you will need to have access to data. If you have access to your organization’s data, you are free to use it. If you choose to do so, all information presented must be in secure form because Davenport University does not assume any responsibility for the security of corporate data. Otherwise, you can select a topic that is amenable to publicly available data.

Click the link for some useful suggestions: Project Proposal Suggestions

There are many publicly available data sets that you can use for your project. The library has compiled a list of many possible sources of data. Click on the link below to explore these sources.

Public Data Sources

The data set you select must have:

• At least 50 observations (50 rows) and at least 4 variables (columns), excluding identification variables
• At least one dependent variable

You must provide:

• A proper citation of the data source using APA style format
• A discussion of how the data was collected and by whom
• The number of variables in the data set
• The number of observations/subjects in the data set
• A description of each variable together with an explanation of how it is measured (e.g. the unit of measurement)

Deliverable

A description of your data analytics project, at least one page long, which must include the following:

• A title for your project
• A brief description of the project
• Major stakeholders who would use the information generated from your analysis, and how they would use or benefit from that information
• A description of the dataset you will use for your project

## Big Data Analytics Assignment

Task: Worldwide Influence of Big Data Analytics on Business Priorities and Decision-making

Big Data analytics has transformed how modern businesses operate. The concept is commonly described by four attributes: value, velocity, volume, and variety (Chen, Chiang and Storey 2012). Research in this area can yield insights that support better strategic business decisions. Big Data analytics has moved beyond storing large amounts of information: analytical methods are now applied iteratively and follow ongoing marketplace trends, including those in mobile applications (LaValle et al., 2011). For instance, businesses can now analyze information almost immediately using in-memory and Hadoop analytics, and can analyze new data sources as well (Demirkan and Delen 2013). As a result, organizations around the world use Big Data to drive business decisions and to improve ROI and business performance (Chen, Chiang and Storey 2012). Big Data Analytics is also a widely studied topic in engineering management, both as a course and as a profession, because it offers many new ways of integrating technologies.

Important Research Question

How does Big Data Analytics influence business decision-making and business priorities?

• Independent variable: Big Data Analytics
• Dependent variables: decision-making and business priorities

Clarity on the Question

Big Data Analytics helps companies harness their data and use it to identify new opportunities, which in turn can lead to smarter business moves, happier consumers, higher profits, and more efficient operations (LaValle et al., 2011). The ability to work faster and stay agile can give companies a competitive edge that they did not have before (Chen, Chiang and Storey 2012). Therefore, to apply Big Data Analytics successfully in business operations, it is important to analyze the question stated above.

Introduction: Big data analysis is a term applied to data sets that are beyond the purview of a traditional database. Big data platforms store data in bulk so that it can be managed systematically. Organizations should manage their data in ways that support future projection and implementation. Because storing data is cumbersome, data should be kept in a well-defined repository, and big data platforms make this storage easier. Big data storage also supports decision-making: consolidated data helps an organization understand its current situation and respond quickly. This data analysis assignment follows the format given in the marking rubric so that all deliverables are covered. Using the format below will help you draft your own data analysis assignment properly.

Aims and objectives: The aim of this data analysis assignment is to identify the options available for storing data in bulk for the benefit of the organization. Adopting big data can also drive the development of storage capacity. Understanding consumers' requirements is of utmost importance, so that the organization can deliver on them while keeping productive development in view (Gandomi & Haider, 2015).

Objectives: The objectives of this data analysis assignment are to

• Provide a platform in which bulk quantities of data can be stored, making room for more inputs
• Provide a platform in which stored data can be made available for use at any point in time
• Emphasize team performance in building a coherent atmosphere for delivering output
• Employ personnel to keep records and to deliver and maintain data in the most comprehensive manner

Research Question: How does Big Data help in decision-making for the organization? How convenient is it for organizational personnel to maintain data in bulk at a single point in time without difficulty? How would the concept of Big Data be useful for an organization's future possibilities?

Literature Review: Big data analytics is one of the most advanced means of managing organizational resources. It helps an organization build momentum, deliver comprehensively, and rethink how it approaches progressive development. It also helps the organization understand consumers' tastes and preferences and act on them. Such development is possible only when top management works together with technological, administrative, and quality control experts, among others; when departments combine their efforts, productive results can be seen from the outset. Big data is taking the corporate world by storm. It has helped break down barriers and so enabled better performance, growth, and development of the organization at large (García et al., 2016).

Management has improved significantly. Big data has helped organizations turn probabilities into possibilities, and this has sped up how they function. Analyzing organizational information has become easier and more convenient, which supports steady improvement in performance. Big Data is a collection of many kinds of information that provides a platform for the all-round development of the organization in the long run. Personnel have also made deliberate attempts to improve the quality of insight into how the organization functions (Assunção et al., 2015).

Broadly, there is scope for all-round development once operations are directed toward the betterment of the organization, and working with both probabilities and possibilities can help diversify the organization's performance in the long run. Big data acts as a warehouse that maintains the data organizations use in their operations, and it is a platform for integrating many technologies. It improves business decision-making, and better decision-making in turn improves future prospects. It has changed the concept of data storage and so supported all-round improvement in the delivery of organizational performance. Big Data is one of the most promising developments in the professional world (Rajaraman, 2016). It has prompted organizations to rethink how they function. With the advent of technology, the need of the hour is productive development rather than the spread of uncertainty.

In the current context, development depends on applying resources that help the organization function. Officials need to work deliberately toward building a platform that maintains momentum for the organization's future growth. A definite framework that maintains structure and allows swift development is necessary (Hu et al., 2014). Applied well, such development creates a forum that serves the organization in the long run.

Data analysis: Data analysis is one of the most important and decisive aspects of management. Through data analysis one can understand the dimensions of big data and so support all-round development in working with it (Hashem et al., 2015). This assignment uses qualitative analysis to understand the pros and cons of the management scenario. Development depends on applying resources to the functioning of the organization. In the context of Big Data, one might question how data is implemented and whether it is necessary. Qualitative data has been one of the most important aspects in reframing research propositions. It helps reveal consumer insights and so supports understanding in the long run. Generating consumer insights is a tedious task, and it falls to management to deliver them comprehensively (Chen et al., 2014).

Online research helps in understanding the views and opinions of the people asked about a particular topic. When applying data analysis, it is necessary to review how the organization's management functions. Delivering on this is challenging for operators and expert officials. Deliberate effort in executing management policies supports the all-round development of the organization, as does keeping note of the records used in its operations. Management should undertake development measures that help it deliver comprehensively, which in turn prompts deliberation about how management functions as a whole. Consulting records to understand the situation is important because it helps improve the quality and prospects of the project (John Walker, 2014).

Big Data is a blessing in disguise. Officials must deliver comprehensively, and minutes should be recorded to maintain records of how management functions as a whole. To encourage development, management must act together on productive measures that support decision-making for the organization as a whole, and must deliver on its plans. To achieve this, officials need training and development so that they can find alternatives that help maintain the management of the organization. Data collection has to be integrated so that it can support repeated transitions. Ineffectiveness has to be eliminated so that outreach can support significant development. Management also has to conduct reviews regularly in order to sustain development in the long run.

Research can be carried out online. Analytical tools should be used to support transitions in how the organization functions, and automation lets development build on itself over the long run. Consumers' needs and wants keep increasing. This has led organizations to maintain big data that helps them meet consumers' needs and so grow their customer base. It helps management achieve positive results and supports the all-round development of the organization. Communication plays an important role in upgrading how the organization functions. To make management effective, effectiveness has to be built into its everyday functioning so that productive development can proceed alongside it (Kambatla et al., 2014).

Transition is the need of the hour and should be incorporated into corporate practice. Corporate culture should be maintained throughout management proceedings. Development requires building transition into how management functions and implementing policies effectively so that they produce effective results (Talia, 2013).

Gantt chart

Conclusion: From the above evaluation, it can be concluded that big data needs to be aligned with how management functions. To achieve this, officials need training and development so that they can find alternatives that help maintain the management of the organization. Data collection has to be integrated so that it can support repeated transitions. Ineffectiveness has to be eliminated so that outreach can support significant development. Management also has to conduct reviews regularly in order to sustain development.

Reference

Assunção, M. D., Calheiros, R. N., Bianchi, S., Netto, M. A., & Buyya, R. (2015). Big Data computing and clouds: Trends and future directions. Journal of Parallel and Distributed Computing, 79, 3-15.

Chen, M., Mao, S., & Liu, Y. (2014). Big data: A survey. Mobile Networks and Applications, 19(2), 171-209.

Gandomi, A., & Haider, M. (2015). Beyond the hype: Big data concepts, methods, and analytics. International Journal of Information Management, 35(2), 137-144.

García, S., Ramírez-Gallego, S., Luengo, J., Benítez, J. M., & Herrera, F. (2016). Big data preprocessing: methods and prospects. Big Data Analytics, 1(1), 9.

Hashem, I. A. T., Yaqoob, I., Anuar, N. B., Mokhtar, S., Gani, A., & Khan, S. U. (2015). The rise of “big data” on cloud computing: Review and open research issues. Information Systems, 47, 98-115.

Hu, H., Wen, Y., Chua, T. S., & Li, X. (2014). Toward scalable systems for big data analytics: A technology tutorial. IEEE Access, 2, 652-687.

John Walker, S. (2014). Big data: A revolution that will transform how we live, work, and think.

Kambatla, K., Kollias, G., Kumar, V., & Grama, A. (2014). Trends in big data analytics. Journal of Parallel and Distributed Computing, 74(7), 2561-2573.

Rajaraman, V. (2016). Big data analytics. Resonance, 21(8), 695-716.

Talia, D. (2013). Clouds for scalable big data analytics. Computer, 46(5), 98-101.

## 1.1-Critically compare different data models and schemas.

Data models tell us how to create the logical design of a database. There are three major types of data models:

## 1.2- Compare database systems with file based systems and also discuss the benefits and limitations of different database technologies with reference to above scenario

File systems and database management systems (DBMS) are two ways of managing data. A DBMS is a computer-based system for recording persistent data.

• In a DBMS, data is stored in large databases, while in a file system, data is stored in the form of files.
• In a file system, tasks like storage, retrieval, and search must be handled by the user or by each application, while in a DBMS such tasks are automated by the system and its tools.
• Handling these tasks manually invites problems with data integrity, data inconsistency, and data security; a DBMS helps avoid such problems.
• A DBMS uses control mechanisms, so data does not have to be read line by line, whereas with file-based systems it typically does.
• A DBMS lets us restrict unauthorized access to the database, whereas plain file-based systems offer much less control.
• A DBMS provides backup and recovery facilities, whereas in a file-based system lost data generally cannot be recovered unless separate backups were made.
• In a DBMS, multiple users can access the data at the same time, whereas file-based systems are basically designed for a single user at a time.
• A DBMS avoids data duplication. If a set of data is required by multiple web applications, it is made available to all of them, whereas in a file-based system data written by one program is often not readable by another.

## 1.3- Analyze different approaches to database design.

Organizations these days follow four major techniques for designing databases. These are:

## 2.1 - Discuss the principles of normalization and steps followed to achieve normal forms

Database normalization is a technique for organizing the data (fields and tables) in a relational database. It is used to minimize redundancy and to ensure that data is stored logically. If a database is not normalized, it becomes difficult to maintain and update without losing data. A database that is not normalized can suffer three kinds of anomalies.

Insertion Anomaly: Consider table Manager:

Suppose a new entry for table Manager has a manager_id, manager_name, and manager_address, but the newly recruited person has not yet been assigned a department. We still have to insert NULL for the department, which is an insertion anomaly.

Deletion Anomaly: In the above table, if manager_id ‘01’ has only one department assigned and temporarily drops it, we might have to delete the entire record rather than only the department value, losing the manager's details as well.

Updation Anomaly: To update the address of a manager who occurs more than once in a table, we need to update the address in every entry, or the data becomes inconsistent. E.g.: in the above table, manager_name = Neil occurs more than once, so manager_address needs to be changed in two rows.

Therefore, to remove such anomalies, we need to normalize our database tables. There are four major normal forms:

First Normal Form (1NF): Every column must hold a single atomic value, there must be no repeating groups of columns, and no two rows may be identical. E.g.: Manager Table: (Before normalization)

Manager Table: (After Normalization)

After converting the Manager table to 1NF, each row is unique, but values such as the manager's name and address are repeated across rows, so redundancy increases.

Second Normal Form (2NF): The table must be in 1NF and must not contain any partial dependencies: every non-key attribute must depend on the whole of a concatenated (composite) primary key, not on just part of it. Manager table after 2NF:

Third Normal Form (3NF): The table must be in 2NF and all transitive dependencies must be removed, so that every non-prime attribute depends directly on the primary key and not on another non-prime attribute.

New tables after 3NF:

Boyce-Codd Normal Form (BCNF): A table is in BCNF if it is in 3NF and the determinant of every non-trivial functional dependency is a superkey. A 3NF table can violate BCNF only when it has multiple overlapping candidate keys.
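The update anomaly described above can be reproduced in a few lines. The sketch below uses Python's built-in `sqlite3` module rather than SQL Server so that it runs anywhere. The table names `manager_flat` and `manager_department`, the department values, and the addresses are hypothetical, since the original example tables are not reproduced in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Unnormalized: the manager's address is repeated once per department.
CREATE TABLE manager_flat (
    manager_id INTEGER, manager_name TEXT, manager_address TEXT, department TEXT
);
INSERT INTO manager_flat VALUES
    (1, 'Neil', 'London', 'Sales'),
    (1, 'Neil', 'London', 'Marketing');

-- Normalized: manager facts are stored once, departments in their own table.
CREATE TABLE manager (
    manager_id INTEGER PRIMARY KEY, manager_name TEXT, manager_address TEXT
);
CREATE TABLE manager_department (
    manager_id INTEGER REFERENCES manager(manager_id),
    department TEXT,
    PRIMARY KEY (manager_id, department)
);
INSERT INTO manager VALUES (1, 'Neil', 'London');
INSERT INTO manager_department VALUES (1, 'Sales'), (1, 'Marketing');
""")

# Update anomaly: changing only one of the repeated rows leaves two addresses.
conn.execute(
    "UPDATE manager_flat SET manager_address = 'Leeds' "
    "WHERE manager_id = 1 AND department = 'Sales'"
)
flat_addresses = {
    row[0]
    for row in conn.execute(
        "SELECT manager_address FROM manager_flat WHERE manager_id = 1"
    )
}

# In the normalized design the address lives in exactly one row.
conn.execute("UPDATE manager SET manager_address = 'Leeds' WHERE manager_id = 1")
normalized_addresses = {
    row[0]
    for row in conn.execute(
        "SELECT manager_address FROM manager WHERE manager_id = 1"
    )
}

print(flat_addresses, normalized_addresses)
```

In the normalized design a new manager with no department is simply a row in `manager` with no matching rows in `manager_department`, which also sidesteps the insertion anomaly.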

## 2.2 – ER Diagram for English Premier League with all entity sets, attributes and relationships and cardinality constraints

## 2.3 – Relational data model

## 2.4 – Create tables using SQL (DDL) commands

Table: Manager

## 2.5 – Screenshots from SQL Server Management Studio (IDE)


## 3.1 - Benefits of using manipulation and query tools in a relational database system

Alter table statement: This statement changes the structure of an existing database table, for example by adding a column or a constraint. E.g.: `ALTER TABLE manager1 ADD PRIMARY KEY (manager_id);`

Drop table statement: This statement deletes a database table along with its schema. E.g.: `DROP TABLE manager1;`

SQL DML commands: DML stands for data manipulation language. These commands manage the data held in schema objects. The insert, update, and delete statements are the three major commands for manipulating the data in database tables.

Insert statement: It inserts records into a database table. E.g., to insert a record into table manager, we write the following query in SQL Server Management Studio: `INSERT INTO manager (manager_id, manager_fname, manager_lname, manager_dob, manager_pob) VALUES (1, 'Neil', 'Armstrong', 'february 27,1984', 'london');`

Delete statement: It deletes records from a database table. E.g., to delete a record from table manager, we write the following query in SQL Server Management Studio: `DELETE FROM manager WHERE manager_id = 2;`

Update statement: It modifies existing data in a database table. E.g., to update a record in table manager, we write the following query in SQL Server Management Studio: `UPDATE manager SET manager_fname = 'Edwin' WHERE manager_id = 1;` This query sets manager_fname to Edwin in the record with manager_id = 1.
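For readers without SQL Server at hand, the three DML statements can be tried with Python's built-in `sqlite3` module. This is a sketch under assumptions: SQLite does not support `ALTER TABLE ... ADD PRIMARY KEY`, so the key is declared when the table is created; the second manager row is made up so that there is something to delete; and dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE manager (
           manager_id INTEGER PRIMARY KEY,
           manager_fname TEXT, manager_lname TEXT,
           manager_dob TEXT, manager_pob TEXT
       )"""
)

insert_sql = (
    "INSERT INTO manager "
    "(manager_id, manager_fname, manager_lname, manager_dob, manager_pob) "
    "VALUES (?, ?, ?, ?, ?)"
)
# Placeholders (?) keep values separate from the SQL text.
conn.execute(insert_sql, (1, "Neil", "Armstrong", "1984-02-27", "london"))
conn.execute(insert_sql, (2, "Jane", "Doe", "1980-01-01", "leeds"))  # hypothetical row

# UPDATE modifies matching rows; DELETE removes them.
conn.execute("UPDATE manager SET manager_fname = ? WHERE manager_id = ?", ("Edwin", 1))
conn.execute("DELETE FROM manager WHERE manager_id = ?", (2,))

remaining = conn.execute(
    "SELECT manager_id, manager_fname, manager_lname FROM manager ORDER BY manager_id"
).fetchall()
print(remaining)
```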

## 3.2 – Populate tables with some data and answer queries –

• List the managers of each team. `SELECT manager_fname, manager_lname FROM manager INNER JOIN team ON manager.manager_id = team.manager_id;`
• Output the full name of the top scorer in the league with the number of goals scored. `SELECT TOP 1 player_fname, player_lname, score FROM players ORDER BY score DESC;`
• Output the average number of goals per game of all games played. `SELECT AVG(score) AS 'Average Score' FROM game;`
• List all of the games that were played in a particular ‘city’. `SELECT game_id FROM game WHERE city = 'Manchester';`
• List all players who have played a game that was refereed by a particular referee (choose any referee). `SELECT players.player_fname, players.player_lname FROM players, game, team, refree WHERE game.refree_id = team.refree_id AND players.player_id = team.player_id AND team.refree_id = refree.refree_id AND refree_fname = 'smith';`
• Create a database view for your system to produce a list of fixtures yet to be played along with the date of each game. `CREATE VIEW fixture AS SELECT game_id, date FROM game WHERE (date >= '');`
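A few of these queries can be checked against a tiny data set with Python's built-in `sqlite3` module. The sketch makes several assumptions: SQLite uses `LIMIT 1` where SQL Server uses `TOP 1`, the player and game rows are invented, dates are ISO text, and the `fixture` view compares against `date('now')` as one possible reading of "yet to be played".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (
    player_id INTEGER PRIMARY KEY, player_fname TEXT, player_lname TEXT, score INTEGER
);
CREATE TABLE game (game_id INTEGER PRIMARY KEY, date TEXT, score INTEGER);

-- Hypothetical sample rows; game 3 lies in the future and has no score yet.
INSERT INTO players VALUES (1, 'Alex', 'Smith', 12), (2, 'Sam', 'Jones', 19),
                           (3, 'Chris', 'Brown', 7);
INSERT INTO game VALUES (1, '2024-01-10', 3), (2, '2024-02-14', 1),
                        (3, '2999-12-31', NULL);

-- Fixtures are games dated today or later.
CREATE VIEW fixture AS SELECT game_id, date FROM game WHERE date >= date('now');
""")

# Top scorer: sort by goals and keep the first row (TOP 1 in SQL Server).
top_scorer = conn.execute(
    "SELECT player_fname || ' ' || player_lname, score "
    "FROM players ORDER BY score DESC LIMIT 1"
).fetchone()

# Average goals per game over games already played.
average_goals = conn.execute(
    "SELECT AVG(score) FROM game WHERE date < date('now')"
).fetchone()[0]

fixtures = conn.execute("SELECT game_id, date FROM fixture").fetchall()
print(top_scorer, average_goals, fixtures)
```

Running queries against a handful of rows whose answers are known in advance is a cheap way to validate them before trusting their output on real data.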

## 3.3 - Critically evaluate the validity of the data extracted using the above queries and comment on the design process followed to ensure that meaningful data is extracted through the use of query tools.

All the queries above were executed in SQL Server Management Studio and returned the relevant information. By using SQL DDL (data definition language) and DML (data manipulation language), meaningful data can be fetched in the IDE.

## 4.1 - Critically review and test the relational database system designed for the given scenario and thus provide a documentation supporting implementation and testing of the relational database system developed.

There are six entities in the given scenario, on which I have based the relational database design: Manager, Players, Coaches, Refree, Team, and Game. I developed the relational database for this system by defining tables with primary key and foreign key constraints. Each entity and relationship has been converted into a table, and its attributes into the corresponding fields. The resulting database holds important data, which must be tested from time to time for discrepancies. Databases in an organization may be accessed by thousands of users every day, and they also contain important functionality such as stored procedures, stored functions, triggers, views, instances, and queries. A complete regression test suite should therefore be run over the database periodically. There are a number of reasons to test a database on a regular basis:

• A database is an important asset on which hundreds or thousands of users depend. If any data is lost, the people who rely on it are affected and may incur heavy losses.
• Databases of large organizations hold mission-critical data that embodies their business functionality, so it is essential to secure them against unauthorized access and unforeseen disasters.
• Testing tells us whether there are defects in the system and whether we must take steps to remove them.
• Changes or modifications to the database can sometimes introduce errors such as overlapping, redundant, or missing data. Regression testing helps us detect those errors.
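One check that such a regression suite might contain is a search for orphaned foreign keys, that is, child rows pointing at a parent row that no longer exists. The sketch below uses Python's built-in `sqlite3` module; the helper name `orphaned_rows` and the sample rows are hypothetical, and table and column names are interpolated into the SQL, so the helper should only ever be called with trusted names.

```python
import sqlite3


def orphaned_rows(conn, child, fk, parent, pk):
    """Count rows in `child` whose `fk` value has no matching `pk` in `parent`.

    Identifiers are interpolated directly, so pass trusted names only.
    """
    query = (
        f"SELECT COUNT(*) FROM {child} AS c "
        f"LEFT JOIN {parent} AS p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL"
    )
    return conn.execute(query).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE manager (manager_id INTEGER PRIMARY KEY, manager_fname TEXT);
CREATE TABLE team (team_id INTEGER PRIMARY KEY, team_name TEXT, manager_id INTEGER);
INSERT INTO manager VALUES (1, 'Neil');
INSERT INTO team VALUES (10, 'Team A', 1);
""")

orphans_before = orphaned_rows(conn, "team", "manager_id", "manager", "manager_id")

# Simulate a faulty change: a team that references a manager who does not exist.
conn.execute("INSERT INTO team VALUES (11, 'Team B', 99)")
orphans_after = orphaned_rows(conn, "team", "manager_id", "manager", "manager_id")

print(orphans_before, orphans_after)
```

Re-running the same check after every schema or data change is what turns it into a regression test.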

## 4.2 - Create brief user documentation for the relational database system developed.

According to the given scenario, the English Premier League has managers, players, coaches, and referees. Teams participate in the English Premier League and play games. A team can be either the home team or the away team. The players who play the game can be goalkeepers, midfielders, defenders, or strikers. The database developed for this scenario is normalized up to 3NF (third normal form), so transitive dependencies are removed and redundancy is reduced. In this project, I have created six database tables: Manager, Players, Coach, Refree, Team, and Game. The tables are described as follows:

• Manager: Manager_id (pk), manager_fname, manager_lname, manager_dob, manager_pob
• Players: Player_id (pk), player_fname, player_lname, player_dob, player_pob, score, game_id (fk)
• Coach: Coach_id (pk), coach_fname, coach_lname
• Refree: Refree_id (pk), refree_fname, refree_lname
• Team: Team_id (pk), team_name, score, city, manager_id (fk), coach_id (fk), player_id (fk), refree_id (fk)
• Game: Game_id (pk), team_id (fk), refree_id (fk), date, score

Where pk represents primary key and fk represents foreign key.
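
The table descriptions above can be sketched as a schema. This is an illustration only, assuming Python's built-in `sqlite3` module and an in-memory SQLite database. The column types, the function name `create_league_db`, and the sample rows are assumptions, since the report does not specify them.

```python
import sqlite3

# Column types are assumed; the report lists only column names and keys.
SCHEMA = """
CREATE TABLE Manager (
    manager_id INTEGER PRIMARY KEY,
    manager_fname TEXT, manager_lname TEXT, manager_dob TEXT, manager_pob TEXT
);
CREATE TABLE Coach (
    coach_id INTEGER PRIMARY KEY, coach_fname TEXT, coach_lname TEXT
);
CREATE TABLE Refree (
    refree_id INTEGER PRIMARY KEY, refree_fname TEXT, refree_lname TEXT
);
CREATE TABLE Game (
    game_id INTEGER PRIMARY KEY,
    team_id INTEGER REFERENCES Team(team_id),
    refree_id INTEGER REFERENCES Refree(refree_id),
    date TEXT, score INTEGER
);
CREATE TABLE Players (
    player_id INTEGER PRIMARY KEY,
    player_fname TEXT, player_lname TEXT, player_dob TEXT, player_pob TEXT,
    score INTEGER,
    game_id INTEGER REFERENCES Game(game_id)
);
CREATE TABLE Team (
    team_id INTEGER PRIMARY KEY, team_name TEXT, score INTEGER, city TEXT,
    manager_id INTEGER REFERENCES Manager(manager_id),
    coach_id INTEGER REFERENCES Coach(coach_id),
    player_id INTEGER REFERENCES Players(player_id),
    refree_id INTEGER REFERENCES Refree(refree_id)
);
"""


def create_league_db():
    """Return an in-memory database with the six tables and enforced keys."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


if __name__ == "__main__":
    conn = create_league_db()
    conn.execute("INSERT INTO Refree VALUES (1, 'Sam', 'Lee')")  # sample row
    conn.execute("INSERT INTO Game (game_id, refree_id, date, score) "
                 "VALUES (1, 1, '2014-11-18', 3)")
    try:
        # refree_id 999 does not exist, so the foreign key rejects the row.
        conn.execute("INSERT INTO Game (game_id, refree_id) VALUES (2, 999)")
    except sqlite3.IntegrityError as exc:
        print("rejected:", exc)
```

With the constraints declared this way, the database itself refuses a game that names a non-existent referee, so that class of error never reaches the data.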

## 4.3 - By giving a tabular V&V document, explain how verification and validation has been addressed.

Software development has the V-Model, popularly known as the verification and validation model. It is executed in a sequential manner, and each phase must be completed before the next phase begins. Verification and validation tasks are performed to determine how consumers or users perceive the final product or software: does it satisfy the quality standards, and is it fit for use?

## 4.4 - Explain control mechanisms and show how these techniques have been used in developing your system.

Control mechanisms are methods or processes for defining and managing variables in a desirable way. For example, a test manager at a deployment site might install a variety of control mechanisms to help monitor the testing activities in the software testing life cycle. We need to apply control mechanisms to our system to ensure that a high standard of quality is met. Quality control is important for keeping the entire process on track: it helps us identify a problem, fix it, and judge the effectiveness of the implemented solution. The quality control process itself must run smoothly, so a control feature must be devised to manage the major portion of the work in a standardized manner. If a task cannot be achieved by the direct method while controlling the system workflow, we need to devise alternative control mechanisms that still comply with the standardized methods. In the process of controlling, one must also know how to react to defects. The areas of the system where defects are most likely to occur must be monitored carefully, so that defects can be caught and fixed there and then rather than carried further along the workflow. Defects must be caught and treated in their nascent stage, or reduced to near zero, so that we can say that Six Sigma is attained.
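
To make "near zero" concrete, Six Sigma is conventionally expressed as no more than 3.4 defects per million opportunities (DPMO). The sketch below shows how that figure could be computed from test results. The function names and the example numbers are hypothetical and are not measurements from this system.

```python
# Conventional Six Sigma target: 3.4 defects per million opportunities.
SIX_SIGMA_DPMO = 3.4


def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities for a batch of inspected units."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("units and opportunities_per_unit must be positive")
    return defects * 1_000_000 / total_opportunities


def meets_six_sigma(defects, units, opportunities_per_unit):
    """True when the observed defect rate is at or below the target."""
    return dpmo(defects, units, opportunities_per_unit) <= SIX_SIGMA_DPMO


if __name__ == "__main__":
    # Hypothetical figures: 1,000 records checked, 500 checks per record.
    print(dpmo(3, 1000, 500))             # defect rate for 3 defects found
    print(meets_six_sigma(3, 1000, 500))  # is the target met?
```

In this setting a "unit" could be a database record and an "opportunity" a single validation rule applied to it, so the control mechanism reduces to tracking one number over time.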

## References:

• ER Diagram in DBMS [Online]. [Accessed on 17 November 2014]. Available on world wide web: <http://www.studytonight.com/dbms/er-diagram.php>
• Structured Query Language/Data Manipulation Language [Online]. [Accessed on 18 November 2014]. Available on world wide web: <http://en.wikibooks.org/wiki/Structured_Query_Language/Data_Manipulation_Language>
• Database Testing [Online]. [Accessed on 18 November 2014]. Available on world wide web: <http://www.agiledata.org/essays/databaseTesting.html>
• Why to test a database [Online]. [Accessed on 18 November 2014]. Available on world wide web: <http://www.softwaretestinggenius.com/why-should-we-test-a-relational-database>
• What is verification and validation [Online]. [Accessed on 18 November 2014]. Available on world wide web: <http://www.softwaretestinghelp.com/what-is-verification-and-validation/>
• Six Sigma control phase [Online]. [Accessed on 18 November 2014]. Available on world wide web: <http://www.managementstudyguide.com/six-sigma-control-phase.htm>
• Normalization of Database [Online]. [Accessed on 18 November 2014]. Available on world wide web: <http://www.studytonight.com/dbms/database-normalization.php>
• Database design strategies [Online]. [Accessed on 18 November 2014]. Available on world wide web: <http://databasemanagement.wikia.com/wiki/Database_Design_Strategies>
• Difference between DBMS and File System [Online]. [Accessed on 18 November 2014]. Available on world wide web: <http://www.differencebetween.com/difference-between-dbms-and-vs-file-system/>
• Difference between file processing system and database management system [Online]. [Accessed on 18 November 2014]. Available on world wide web: <http://studychacha.com/discuss/135910-difference-between-file-processing-system-database-management-system.html>

