When you’re using a data warehouse, some actions get repeated over and over. We will take a look at four common algorithms used to deal with these situations. Most DWH (data warehouse) systems today are on a RDBMS (relational database management system) platform. These databases (Oracle, DB2, or Microsoft SQL Server) are widely used, easy to work with, and mature – a very important thing to bear in mind when choosing a platform.
Database schema migration is never an easy job. In fact, it can really be a headache, even when you’re working with a familiar system. For example, at times Oracle 10g may not drop the associated index for a primary key or unique constraint that has been dropped. In this article, I am going to explain when and why this happens. The Story: I’ve been working on the development of an e-commerce platform.
Having reference tables in your database is no big deal, right? You just need to tie a code or ID with a description for each reference type. But what if you literally have dozens and dozens of reference tables? Is there an alternative to the one-table-per-type approach? Read on to discover a generic and extensible database design for handling all your reference data. This unusual-looking diagram is a bird’s-eye view of a logical data model (LDM) containing all the reference types for an enterprise system.
We’ve had tremendously positive feedback on my recent article that talked about “Why SQL is neither legacy, nor low-level, nor difficult, nor the wrong place for (business) data logic, but simply awesome” both within the blog’s comment section as well as on reddit. However, one of the sections triggered very controversial feedback. Clearly, not everyone agreed to: Fallacy #5: The database is the wrong place for business logic
In the 3rd post in this series, we looked at how we prepare data for use with a concept called the Business Data Vault. Now, in this final part, I will show you the basics of how we project the Business Vault and Raw DV tables into star schemas which form the basis for our Information Marts. Raw Data Mart vs. Information Marts As of Data Vault 2.0, the terminology changed a bit to be more precise.
In my last post, we looked at the need for an Agile Data Engineering solution, issues with some of the current data warehouse modeling approaches, the history of data modeling in general, and Data Vault specifically. This time we get into the technical details of what the Data Vault Model looks like and how you build one. For my examples I will be using a simply Human Resources (HR) type model that most people should relate to (even if you have never worked with an HR model).
Why Talk About Errors? Model Setup 1 – Using Invalid Names 2 – Insufficient Column Width 3 – Not Indexing Properly 4 – Not Considering Possible Volume or Traffic 5 – Ignoring Time Zones 6 – Missing Audit Trail 7 – Ignoring Collation Why Talk About Errors? The art of designing a good database is like swimming. It is relatively easy to start and difficult to master.
Generating unique integers is a very common task in database systems. Many applications require each row in a given table to hold a unique value. One way to tackle this problem is to use sequences. What are Sequences? A sequence is a database object which allows users to generate unique integer values. The sequence is incremented every time a sequence number is generated. The incrementation occurs even if the transaction rolls back, which may result in gaps between numbers.
The concept of views and function-based indexes has been known for many years. One of the brand new solutions is a virtual column – a feature introduced in Oracle 11g. Apart from database giant, some well known DB vendors, like MariaDB and SQL Server, support the idea of computed columns. So let’s give virtual columns a try and examine their basic usage. Generally, there are two kinds of virtual columns:
Anyone who had to schedule an intercontinental phone call knows that there is no such thing as a simple time called now. What you should rather think about is a time comprised of here and now. The Earth rotates around its own axis. When it’s solar noon (the sun is at its highest position) in one place, it’s already past noon in places to the east and it’s still before noon in places to the west.
Oracle bases its language support on the values of parameters that begin with NLS. These parameters specify, for example, how to display currency or how the name of a day is spelled. The table below presents some of the NLS parameters. By using one of them, NLS_SORT, we can specify the sort method (binary or linguistic) for both SQL WHERE clause operations and NLSSORT function operations. Option Name Description NLS_LANG The current language, territory, and database character set, which are determined by session-wide globalization parameters.
Many features have arisen throughout the evolution of the Oracle database. In this article, let’s focus on temporary tables. They were introduced fairly late (Oracle 8i) and are now often considered to be indispensable for DBAs and developers. Description It’s worth mentioning up front that the table itself is not temporary, but rather the data within it. The data in such a table is stored only as long as the session or transaction lasts and is private for each session, however the definition is visible to all sessions.
Oracle’s Deviations From the Rules The most famous offering from Oracle is its relational database, which also brings Oracle the most profits. As the second-largest software giant (after Microsoft), Oracle has some of its own conventions which may or may not be consistent with other generally accepted conventions. Maybe that’s the reason why so many don’t understand Oracle. First of all, Oracle doesn’t follow the SQL standard. In many computer science courses, students practice and learn the fundamentals of databases using an Oracle database.
Synonyms are a very powerful feature of Oracle. They are auxiliary names that relate to other database objects: tables, procedures, views, etc. They work like Unix hard links; a pointer to point to an object that exists somewhere else. Synonyms can be created as PRIVATE (by default) or PUBLIC. public synonyms are available to all users in the database. private synonyms exist only in specific user schema (they are available only to a user and to grantees for the underlying object) The syntax for a synonym is as follows:
The Vertabelo journey continues … We now have almost 10,000 users and the number of Vertabelo advocates keeps growing strong. Vertabelo users come from over 100 countries and speak various languages. What unites them? The relational database. Let’s see what relational databases they use: We wanted to determine the most popular database engine among Vertabelo users based on one of three widely-used operating systems: Windows, Linux, Mac OS.
The concept of materialized views (MVs) is almost 15 years old; Oracle first introduced these views in the 8i version of its DBMS. However, some well known DB vendors (like MySQL) still don’t support MVs or have added this functionality only quite recently (it’s available in PostgreSQL since version 9.3, which was released just a year ago). In this article I’ll try to give you some tips about when you should use MVs in OLTP systems.
When looking at different kinds of ERD notations, it is hard not to come across Barker’s ERD notation, which is commonly used to describe data for Oracle. Richard Barker and his coworkers developed this ERD notation while working at the British consulting firm CACI around 1981, and when Barker joined Oracle, his notation was adopted. Let’s take a closer look at Barker’s syntax. The most important components in the ERD diagram are:
Some time ago, the Vertabelo Team participated in the PostgreSQL Conference Europe 2013. Some of the talks were really nice. One of them stuck in my head for quite a long time. It was Markus Winand’s lecture titled “Indexes: The neglected performance all-rounder.” Although I had had a solid background in databases, this 50 minutes long talk showed me that not everything concerning indexes was as clear to me as I had thought.
Generally, we don’t limit query results. However, when we only care about the first few rows or to implement table pagination, limiting query results is just what we need. Database vendors provide us with such functionality; most of them in their own distinct way. Example Let’s take a look at the 2014 Sochi Olympics Men’s Normal Hill Individual ski jumping results in the skijump_results table. There is no index on the skijump_results table.
If you were to implement a Top-N or pagination query in an Oracle database, you wouldn’t find any dedicated clause to limit the query result like TOP, LIMIT or FETCH FIRST. For each row returned by the query, Oracle provides a ROWNUM pseudocolumn that returns a number indicating the order in which the database selects the row from a table or set of joined views. Example Let’s take a look at the 2014 Sochi Olympics Men’s Normal Hill Individual ski jumping results in the skijump_results table.
When you create a foreign key in your database, you can specify what happens upon delete of the parent row. There are usually four possibilities: ON DELETE SET NULL ON DELETE CASCADE ON DELETE NO ACTION ON DELETE RESTRICT Today we’ll investigate the subtle difference between the last two options. In Some Databases There Is No Difference at All In Oracle, there is no RESTRICT keyword. The only option is NO ACTION.
I like Oracle database. It is efficient, easy to use for beginners and professionals – its tools for query analysis and optimization are masterpieces. But it has a very annoying “feature” – empty text (containing 0 characters) is stored in the database as null. Where did such a feature come from? Probably from the ancient ages, just after the dawn of time, i.e., the late 70's . In that age, memory (RAM and disks) was very limited and system designers did their best to use as little memory as possible.
Recently a fellow database architect claimed that in Oracle the type VARCHAR2(255) means a string of 255 bytes, not characters. There is not much difference between the two in the English-speaking world. It matters though if you want to handle people with names like Kołłątaj. (Not that Hugo Kołłątaj – a famous Polish 18th century politician – would ever use any of our systems, but he became our byword for all non-pure-ASCII names).
The question: Why doesn’t my script create tables? The other day I was testing Oracle SQL scripts generated by Vertabelo. Roughly, this is the code that was generated: ... -- Table: book CREATE TABLE book ( id integer NOT NULL, title varchar2(120) NOT NULL, isbn varchar2(15) NOT NULL, PRIMARY KEY (id) ) ; ... I used sqlplus to execute my script and see if it’s correct. sqlplus (database-details) The script run without errors but the tables where NOT created.