cheatsheet

> data structure
data ~ tables ~ spreadsheets

  • row = records
  • columns = fields
    • each col has a data type (integer, varchar (text), date, …)

> basic SQL commands

  • SELECT retrieve data from one or more tables list the col to display
  • FROM which table, can be aliases? ?? ❓
  • AS rename final column output
  • WITH table AS query
  • INSERT add new records (rows) to a table
  • UPDATE modify existing records (rows) in a table
  • DELETE remove records (rows) from a table
  • CREATE, ALTER, DROP manage db objects like tables and schemas

> filtering and sorting

  • WHERE clause filters results based on certain conditions
  • ORDER BY sorts data by specified columns
  • GROUP BY groups rows sharing values in specified columns for agg functions (SUM, AVG, …)

> relationships and joins

  • JOIN ... ON
  • INNER JOIN
  • LEFT JOIN
  • RIGHT JOIN
  • FULL JOIN
  • USING

> advanced features

  • COUNT
  • SUM, MAX, MIN, AVG

> logic, condition, …

  • OR, AND, NOT, BETWEEN, LIKE
  • IS NULL vs. IS NOT NULL
  • IN vs. NOT IN

> others?

  • LIMIT
  • CASE
CASE
WHEN ... THEN ... 
WHEN ... THEN ...
ELSE ... 
END AS varname
  • concat texts var1 || ' ' || var2
  • convert var::text
  • search substring var::text LIKE '%SUBSTRING'
  • substr the first 6 char
    • LEFT(var, 6)
  • /*... */ or -- = comments

SQL Arithmetic Operators

OperatorDescriptionExample
+Add
-Subtract
*Multiply
/Divide
%Modulo (remainder of division)

SQL Bitwise Operators

OperatorDescription
&Bitwise AND
|Bitwise OR
^Bitwise exclusive OR

SQL Comparison Operators

OperatorDescriptionExample
=Equal to
>Greater than
<Less than
>=Greater than or equal to
Less than or equal to
<>Not equal to

SQL Logical Operators

OperatorDescriptionExample
ALLTRUE if all of the subquery values meet the condition
ANDTRUE if all the conditions separated by AND is TRUE
ANYTRUE if any of the subquery values meet the condition
BETWEENTRUE if the operand is within the range of comparisons
EXISTSTRUE if the subquery returns one or more records
INTRUE if the operand is equal to one of a list of expressions
LIKETRUE if the operand matches a pattern
NOTDisplays a record if the condition(s) is NOT TRUE
ORTRUE if any of the conditions separated by OR is TRUE
SOMETRUE if any of the subquery values meet the condition
XORTRUE if only 1 is true

SQL Functions

source: https://sqlzoo.net/wiki/FUNCTIONS

FunctionDescriptionExample
% MODULORemainder of division (MOD())
vs. DIV() division operation
15 % 4 → 3
+ (string)String concatenation (in some DBs)
… or use CONCAT()
'abc' + 'def' → 'abcdef'
+ INTERVALAdd time interval to datedate + INTERVAL 7 DAY
+ (dates)Add days to datedate + 7 → date+7 days
ABSAbsolute valueABS(-5) → 5
AVGAverage of valuesAVG(price) → 24.99
CASEConditional logic
❗ don’t forget ‘END’
CASE WHEN x>0 THEN 'pos' ELSE 'neg' END
CASTConvert data typeCAST('123' AS INT) → 123
CEILRound up to nearest integer
vs. FLOOR (down)
CEIL(4.2) → 5
COALESCEReturn first non-NULL value
or REPLACE a NULL with a specific value
COALESCE(NULL, 5, 10) → 5
COALESCE(leader, 'UNDEFINED!')
CONCATConcatenate stringsCONCAT('a', 'b') → 'ab'
COSCosine of angleCOS(0) → 1
COUNTCount rowsCOUNT(*) → 42
CURRENT_DATEToday’s dateCURRENT_DATE → 2026-02-17
CURRENT_TIMESTAMPCurrent date and timeCURRENT_TIMESTAMP → 2026-02-17 14:30:00
DATEPARTExtract date partDATEPART(year, date) → 2026
DAYDay of monthDAY('2026-02-17') → 17
DIVInteger division
vs. % = remainder
15 DIV 4 → 3
EXTRACTExtract date partEXTRACT(YEAR FROM date) → 2026
FLOORRound down to nearest int
vs. CEIL (up)
FLOOR(4.7) → 4
HOURHour from timeHOUR('14:30:00') → 14
IFNULLReturn alternative if NULLIFNULL(col, 0) → col or 0
INSTRFind substring positionINSTR('hello', 'll') → 3
LEFTLeftmost characters (substring)LEFT('hello', 2) → 'he'
LENString length (SQL Server)LEN('hello') → 5
LENGTHString lengthLENGTH('hello') → 5
LOWERLower caseLOWER('France') → 'france'
MAXMaximum valueMAX(salary) → 150000
MINMinimum valueMIN(salary) → 35000
MINUTEMinute from timeMINUTE('14:30:45') → 30
MODRemainder (modulo)
… same as %
MOD(15, 4) → 3
MONTHMonth numberMONTH('2026-02-17') → 2
NULLIFNULL if values equalNULLIF(5, 5) → NULL
NVLReplace NULL (Oracle)NVL(col, 0) → col or 0
PATINDEXPattern positionPATINDEX('%ll%', 'hello') → 3
POSITIONSubstring positionPOSITION('ll' IN 'hello') → 3
QUARTERQuarter of yearQUARTER('2026-02-17') → 1
RANKRank with gapsRANK() OVER(ORDER BY score) → 1,1,3
REPLACEReplace substringREPLACE('hello', 'l', 'x') → 'hexxo'
RIGHTRightmost characters (substring)RIGHT('hello', 2) → 'lo'
ROUNDRound number (param can be negative)ROUND(4.25, 1) → 4.3
SECONDSeconds from timeSECOND('14:30:45') → 45
SINSine of angleSIN(0) → 0
SUBSTRSubstring (nope with sqlserver)SUBSTR('hello', 2, 3) → 'ell'
SUBSTRINGExtract substring (compatible w all)SUBSTRING('hello', 2, 3) → 'ell'
SUBSTRING(ansi)ANSI standard substringSUBSTRING('hello' FROM 2 FOR 3) → 'ell'
SUMSum of valuesSUM(amount) → 1500.50
TANTangent of angleTAN(0) → 0
TO_CHAR(dates)Format date as stringTO_CHAR(date, 'YYYY-MM-DD') → '2026-02-17'
TRIMRemove spacesTRIM(' hello ') → 'hello'
UPPERUpper caseUPPER('france') → 'FRANCE'
YEARYear from dateYEAR('2026-02-17') → 2026

sqlbolt tutorial

1-5) SELECT queries 101

WHERE and FROM to get started + operators
  • WHERE clause for constraints
SELECT column, another_column, …
FROM mytable
WHERE condition
    AND/OR another_condition
    AND/OR …;
OperatorConditionSQL Example
=, !=, <, , >, >=Standard numerical operatorscol_name != 4
BETWEEN … AND …Number is within range of two values (inclusive)col_name BETWEEN 1.5 AND 10.5
NOT BETWEEN … AND …Number is not within range of two values (inclusive)col_name NOT BETWEEN 1 AND 10
IN (…)Number exists in a listcol_name IN (2, 4, 6)
NOT IN (…)Number does not exist in a listcol_name NOT IN (1, 3, 5)
OperatorConditionExample
=Case sensitive exact string comparison (notice the single equals)col_name = “abc”
!= or <>Case sensitive exact string inequality comparisoncol_name != “abcd”
LIKECase insensitive exact string comparisoncol_name LIKE “ABC”
NOT LIKECase insensitive exact string inequality comparisoncol_name NOT LIKE “ABCD”
%Used anywhere in a string to match a sequence of zero or more characters (only with LIKE or NOT LIKE)col_name LIKE “%AT%”
(matches “AT”, “ATTIC”, “CAT” or even “BATS”)
_Used anywhere in a string to match a single character (only with LIKE or NOT LIKE)col_name LIKE “AN_”
(matches “AND”, but not “AN”)
IN (…)String exists in a listcol_name IN (“A”, “B”, “C”)
NOT IN (…)String does not exist in a listcol_name NOT IN (“D”, “E”, “F”)
ORDER BY and LIMIT and OFFSET for filtering
  • Filtering and sorting Query results
SELECT DISTINCT column, another_column, …
FROM mytable
WHERE condition(s)
ORDER BY column ASC/DESC
LIMIT num_limit OFFSET num_offset;
IS NULL
WHERE column IS/IS NOT NULL

6-7) Multi-table queries with JOINs

JOIN and ON for default (INNER) JOIN

Tables that share information about a single entity need to have a primary key that identifies that entity uniquely across the database. One common primary key type is an auto-incrementing integer (because they are space efficient), but it can also be a string, hashed value, so long as it is unique.

SELECT column, another_table_column, …
FROM mytable
INNER JOIN another_table 
    ON mytable.id = another_table.id
WHERE condition(s)
ORDER BY column, … ASC/DESC
LIMIT num_limit OFFSET num_offset;

The INNER JOIN is a process that matches rows from the first table and the second table which have the same key (as defined by the ON constraint) to create a result row with the combined columns from both tables. After the tables are joined, the other clauses we learned previously are then applied.

  • INNER JOIN = JOIN cause it’s the default one!
LEFT/RIGHT/FULL for OUTER JOIN
  • vs. OUTER JOIN = LEFT/RIGHT/FULL JOINs

If the two tables have asymmetric data, which can easily happen when data is entered in different stages, then we would have to use a LEFT JOINRIGHT JOIN or FULL JOIN instead to ensure that the data you need is not left out of the results.

SELECT column, another_table_column, …
FROM mytable
INNER/LEFT/RIGHT/FULL JOIN another_table 
    ON mytable.id = another_table.matching_id
WHERE condition(s)
ORDER BY column, … ASC/DESC
LIMIT num_limit OFFSET num_offset;

Like the INNER JOIN these three new joins have to specify which column to join the data on.

  • LEFT JOIN: When joining table A to table B, a LEFT JOIN simply includes rows from A regardless of whether a matching row is found in B.
  • RIGHT JOIN: … the same, but reversed, keeping rows in B regardless of whether a match is found in A.
  • FULL JOIN: rows from both tables are kept, regardless of whether a matching row exists in the other table.

9-11) Queries with expressions, aggregates

AS for aliases and expressions
SELECT col_expression AS expr_description, …
FROM mytable;
 
SELECT particle_speed / 2.0 AS half_particle_speed
FROM physics_data
WHERE ABS(particle_position) * 10.0 > 500;
 
SELECT column AS better_column_name, …
FROM a_long_widgets_table_name AS mywidgets
INNER JOIN widget_sales
  ON mywidgets.id = widget_sales.widget_id;
AGG_FUNC and GROUP BY for aggregates
SELECT AGG_FUNC(column_or_expression) AS aggregate_description, …
FROM mytable
WHERE constraint_expression
GROUP BY column;
  • The GROUP BY clause works by grouping rows that have the same value in the column specified.
    • If you don’t add this, the result can be much different cause it assumes operation on all the table!
FunctionDescription
COUNT(*), COUNT(column)A common function used to counts the number of rows in the group if no column name is specified.

Otherwise, count the number of rows in the group with non-NULL values in the specified column.
MIN(column)Finds the smallest numerical value in the specified column for all rows in the group.
MAX(column)Finds the largest numerical value in the specified column for all rows in the group.
AVG(column)Finds the average numerical value in the specified column for all rows in the group.
SUM(column)Finds the sum of all numerical values in the specified column for the rows in the group.
GROUP BY and HAVING for additional conditions

Luckily, SQL allows us to do this by adding an additional HAVING clause which is used specifically with the GROUP BY clause to allow us to filter grouped rows from the result set.

The HAVING clause constraints are written the same way as the WHERE clause constraints, and are applied to the grouped rows. With our examples, this might not seem like a particularly useful construct, but if you imagine data with millions of rows with different properties, being able to apply additional constraints is often necessary to quickly make sense of the data.

SELECT group_by_column,
AGG_FUNC(column_expression) AS aggregate_result_alias, …
FROM mytable
WHERE condition
GROUP BY column
HAVING group_condition;

If you aren’t using the GROUP BY clause, a simple WHERE clause will suffice.

12) Order of execution of a Query

SELECT DISTINCT column, AGG_FUNC(column_or_expression), …
FROM mytable
    JOIN another_table
      ON mytable.column = another_table.column
    WHERE constraint_expression
    GROUP BY column
    HAVING constraint_expression
    ORDER BY column ASC/DESC
    LIMIT count OFFSET COUNT;

… Because each part of the query is executed sequentially, it’s important to understand the order of execution so that you know what results are accessible where.

13-15) Row operations

We previously described a table in a database as a

  • two-dimensional set of rows and columns,
  • with the columns being the properties and
  • the rows being instances
  • of the entity in the table.

In SQL, the database schema is what describes the structure of each table, and the datatypes that each column of the table can contain. (e.g. Integer, Strong, …)

This fixed structure is what allows a database to be efficient, and consistent despite storing millions or even billions of rows.

INSERT to add rows

When inserting data into a database, we need to use an INSERT statement, which declares which table to write into, the columns of data that we are filling, and one or more rows of data to insert. In general, each row of data you insert should contain values for every corresponding column in the table. You can insert multiple rows at a time by just listing them sequentially.

INSERT INTO mytable
VALUES (value_or_expr, another_value_or_expr, …),
       (value_or_expr_2, another_value_or_expr_2, …),
       …;

In some cases, if you have incomplete data and the table contains columns that support default values, you can insert rows with only the columns of data you have by specifying them explicitly.

INSERT INTO mytable
(column, another_column, …)
VALUES (value_or_expr, another_value_or_expr, …),
      (value_or_expr_2, another_value_or_expr_2, …),
      …;
UPDATE and SET to update rows

In addition to adding new data, a common task is to update existing data, which can be done using an UPDATE statement. Similar to the INSERT statement, you have to specify exactly which table, columns, and rows to update. In addition, the data you are updating has to match the data type of the columns in the table schema.

UPDATE mytable
SET column = value_or_expr, 
    other_column = another_value_or_expr, 

WHERE condition;

Most people working with SQL will make mistakes updating data at one point or another. Whether it’s updating the wrong set of rows in a production database, or accidentally leaving out the WHERE clause (which causes the update to apply to all rows), you need to be extra careful when constructing UPDATE statements.

One helpful tip is to always write the constraint first and test it in a SELECT query to make sure you are updating the right rows, and only then writing the column/value pairs to update.

DELETE FROM to delete rows
DELETE FROM mytable
WHERE condition;

16-18) Table operations

CREATE TABLE IF NOT EXISTS to create tables
CREATE TABLE IF NOT EXISTS mytable (
    column DataType TableConstraint DEFAULT default_value,
    another_column DataType TableConstraint DEFAULT default_value,

);

Here’s an example

CREATE TABLE movies (
    id INTEGER PRIMARY KEY,
    title TEXT,
    director TEXT,
    year INTEGER, 
    length_minutes INTEGER
);
Table data types
Data typeDescription
INTEGERBOOLEANThe integer datatypes can store whole integer values like the count of a number or an age. In some implementations, the boolean value is just represented as an integer value of just 0 or 1.
FLOATDOUBLEREALThe floating point datatypes can store more precise numerical data like measurements or fractional values. Different types can be used depending on the floating point precision required for that value.
CHARACTER(num_chars)VARCHAR(num_chars)TEXTThe text based datatypes can store strings and text in all sorts of locales. The distinction between the various types generally amount to underlaying efficiency of the database when working with these columns.

Both the CHARACTER and VARCHAR (variable character) types are specified with the max number of characters that they can store (longer values may be truncated), so can be more efficient to store and query with big tables.
DATEDATETIMESQL can also store date and time stamps to keep track of time series and event data. They can be tricky to work with especially when manipulating data across timezones.
BLOBFinally, SQL can store binary data in blobs right in the database. These values are often opaque to the database, so you usually have to store them with the right metadata to requery them.
Table constraints
ConstraintDescription
PRIMARY KEYThis means that the values in this column are unique, and each value can be used to identify a single row in this table.
AUTOINCREMENTFor integer values, this means that the value is automatically filled in and incremented with each row insertion. Not supported in all databases.
UNIQUEThis means that the values in this column have to be unique, so you can’t insert another row with the same value in this column as another row in the table. Differs from the PRIMARY KEY in that it doesn’t have to be a key for a row in the table.
NOT NULLThis means that the inserted value can not be NULL.
CHECK (expression)This allows you to run a more complex expression to test whether the values inserted are valid. For example, you can check that values are positive, or greater than a specific size, or start with a certain prefix, etc.
FOREIGN KEYThis is a consistency check which ensures that each value in this column corresponds to another value in a column in another table.

For example, if there are two tables, one listing all Employees by ID, and another listing their payroll information, the FOREIGN KEY can ensure that every row in the payroll table corresponds to a valid employee in the master Employee list.
ALTER to alter tables

As your data changes over time, SQL provides a way for you to update your corresponding tables and database schemas by using the ALTER TABLE statement to add, remove, or modify columns and table constraints.

ALTER TABLE mytable
ADD column DataType OptionalTableConstraint 
    DEFAULT default_value;
ADD to add column
ALTER TABLE mytable
ADD column DataType OptionalTableConstraint 
    DEFAULT default_value;
DROP to remove column
ALTER TABLE mytable
DROP column_to_be_deleted;
RENAME to rename table
ALTER TABLE mytable
RENAME TO new_table_name;
DROP TABLE IF EXISTS to drop tables
DROP TABLE IF EXISTS mytable;

19) Subqueries

Use WHERE clause to subquery things

  • They are basically nested queries, they can be referenced inside:
    • FROM, JOIN, WHERE or HAVING
      Here’s an example

”… find out which of your Associates are costing the company more than the average revenue brought per Associate”

SELECT *
FROM sales_associates
WHERE salary > 
   (SELECT AVG(revenue_generated)
    FROM sales_associates);
Correlated subqueries

a.k.a. “synchronised sub-query”

A more powerful type of subquery is the correlated subquery in which the inner query references, and is dependent on, a column or alias from the outer query. Unlike the subqueries above, each of these inner queries need to be run for each of the rows in the outer query, since the inner query is dependent on the current outer query row.

Here’s an example

“Instead of the list of just Sales Associates above, imagine if you have a general list of Employees, their departments (engineering, sales, etc.), revenue, and salary. This time, you are now looking across the company to find the employees who perform worse than average in their department.
For each employee, you would need to calculate their cost relative to the average revenue generated by all people in their department. To take the average for the department, the subquery will need to know what department each employee is in:“

SELECT *
FROM employees
WHERE salary > 
   (SELECT AVG(revenue_generated)
    FROM employees AS dept_employees
    WHERE dept_employees.department = employees.department);
WHERE and IN… existence tests
SELECT *, …
FROM mytable
WHERE column
    IN/NOT IN (SELECT another_column
               FROM another_table);
 

20) Unions, intersections and exceptions

When working with multiple tables, the UNION and UNION ALL operator allows you to append the results of one query to another assuming that they have the same column count, order and data type. If you use the UNION without the ALL, duplicate rows between the tables will be removed from the result.

SELECT column, another_column
   FROM mytable
UNION / UNION ALL / INTERSECT / EXCEPT
SELECT other_column, yet_another_column
   FROM another_table
ORDER BY column DESC
LIMIT n;

Similar to the UNION, the INTERSECT operator will ensure that only rows that are identical in both result sets are returned, and the EXCEPT operator will ensure that only rows in the first result set that aren’t in the second are returned. This means that the EXCEPT operator is query order-sensitive, like the LEFT JOIN and RIGHT JOIN.

Both INTERSECT and EXCEPT also discard duplicate rows after their respective operations, though some databases also support INTERSECT ALL and EXCEPT ALL to allow duplicates to be retained and returned.

select star sql tutorial

Chapter 1

table: executions
columns:

  • first_name, last_name, ex_number, ex_age, ex_date, county, last_statement,

Even though we’ve capitalized SELECTFROM and LIMIT, SQL commands are not case-sensitive. You can see that the code editor recognizes them and formats them as a command no matter the capitalization. Nevertheless, I recommend capitalizing them to differentiate them from column names, table names and variables.
Column names, table names and variables are also not case-sensitive in this version of SQL, though they are case-sensitive in many other versions. To be on the safe side, I recommend always assuming they are case-sensitive.

SQL can evaluate the truth-value of almost anything WHERE 1 is valid and will return everything

This is because SQL is doing integer division. To do decimal division, at least one of the operands must be a decimal, for instance 51.0 / 2. A common trick is to multiply one number by 1.0 to convert it into a decimal. This will come in useful in the later chapters.

Chapter 2
CASE
  WHEN <clause> THEN <result>
  WHEN <clause> THEN <result>
  ...
  ELSE <result>
END

Here’s a weird example:

SELECT
  COUNT(CASE WHEN county='Harris' THEN 1
    ELSE NULL END),
  COUNT(CASE WHEN county='Bexar' THEN 1
    ELSE NULL END)
FROM executions

Dialects

Although we’ve called this a book about SQL, if we want to be pedantic it really is a book about SQLite. This is because SQL is an imaginary concept, a platonic ideal. In reality, there are only dialects that try to conform to the SQL specifications.

SQL is also under-specified, meaning that some functionality is not specified by the standards. For instance, the standards don’t say whether the string length-finding function should be called LEN (SQL Server) or LENGTH (SQLite); or how identifiers like table or column names should be quoted (` in MySQL, " in SQLite).

Here’s how to count the proportion of something:

SELECT COUNT(CASE WHEN last_statement LIKE '%innocent%' THEN 1
		    ELSE NULL END) * 100.0 / COUNT(*)
FROM executions
Chapter 3
SELECT
  last_statement IS NOT NULL AS has_last_statement,
  county,
  COUNT(*)
FROM executions
GROUP BY has_last_statement, county
  • you can GROUP BY multiple columns!

and now adding some HAVING extra condition on the GROUP BY

SELECT county, COUNT(*)
FROM executions
WHERE ex_age >= 50
GROUP BY county
HAVING COUNT(*) > 2;

Some examples of nested queries

SELECT first_name, last_name
FROM executions
WHERE LENGTH(last_statement) =
    (SELECT MAX(LENGTH(last_statement))
     FROM executions)
SELECT
  county,
  100.0 * COUNT(*) / (SELECT COUNT(*) FROM executions)
    AS percentage
FROM executions
GROUP BY county
ORDER BY percentage DESC

necessary whenever the things you want to retrieve cannot be processed with the main / outer query

Such a function would need to aggregate both within the groups (to get the numerator) and throughout the dataset (to get the denominator).
But each query either has a GROUP BY block or doesn’t. So what we really need are two separate queries, one which aggregates with a GROUP BY and another that aggregates without. We can then combine them using a technique called “nesting”.

To reiterate, nesting is necessary here because in the WHERE clause, as the computer is inspecting a row to decide if its last statement is the right length, it can’t look outside to figure out the maximum length across the entire dataset. We have to find the maximum length separately and feed it into the clause. Now let’s apply the same concept to find the percentage of executions from each county.

Chapter 4
  • <table1> JOIN <table2> ON <clause>

When joining tables that are derived from the same table, this is called a “self join” and is a powerful technique for allowing rows to get information from other parts of the same table.

SELECT
  last_ex_date AS start,
  ex_date AS end,
  JULIANDAY(ex_date) - JULIANDAY(last_ex_date)
    AS day_difference
FROM executions
JOIN (SELECT ex_number + 1 AS ex_number, 
		ex_date AS last_ex_date
		FROM executions
		WHERE ex_number < 553) previous
  ON executions.ex_number = previous.ex_number
ORDER BY day_difference DESC
LIMIT 10

which is equivalent to

SELECT
  previous.ex_date AS start,
  executions.ex_date AS end,
  JULIANDAY(executions.ex_date) - JULIANDAY(previous.ex_date)
    AS day_difference
FROM executions
JOIN executions previous
  ON executions.ex_number = previous.ex_number + 1
ORDER BY day_difference DESC
LIMIT 10

The big idea behind JOINs has been to create an augmented table because the original didn’t contain the information we needed. This is a powerful concept because it frees us from the limitations of a single table and allows us to combine multiple tables in potentially complex ways. We’ve also seen that with this extra complexity, meticulous bookkeeping becomes important. Aliasing tables, renaming columns and defining good JOIN ON clauses are all techniques that help us maintain order.

Chapter 5

Some of the major ones were:

  • Comparing the shape of available tables with the desired result to decide what aggregation to do.
  • Examining where the information we require resides. Is it in an adjacent row? Or a group? Or the entire dataset?
  • Interpreting queries in the right logical structure. For instance, viewing clauses as essentially a true or false; viewing <table1> JOIN <table2> ON ... as a great big table.

❗ The last exercices are less straightforward.

sqlzoo tutorial

  • ✅ SELECT basics

    • ✅ quiz
  • ✅ SELECT from world

    • ✅ quiz
  • ✅ SELECT from nobel

    • ✅ quiz
  • ✅ SELECT in SELECT

    • ✅ quiz
  • ✅ SUM and COUNT

    • ✅ quiz
  • ✅ JOIN

    • ✅ quiz
  • ✅ More JOIN

    • ✅ quiz
  • ✅ Using NULL

    • ✅ quiz
  • ✅ Self JOIN

    • ✅ quiz
  • ALL/ANY/IN comparisons: make sure there isn’t any NULL value otherwise, the result = NULL!

SELECT name
FROM world
WHERE gdp > ALL(SELECT gdp
                FROM world
                WHERE continent = 'Europe'
                AND gdp IS NOT NULL)
  • correlated sub-query =
    tables in inner and outer SELECT are the same,
    values from outer are used in inner SELECT
SELECT continent, name, area
FROM world x
WHERE area = (SELECT MAX(area) FROM world y
		      WHERE y.continent=x.continent)
  • To review…
SELECT DISTINCT continent
FROM world w1
WHERE NOT EXISTS (
    SELECT 1
    FROM world w2
    WHERE w2.continent = w1.continent
      AND w2.population > 25000000
)
SELECT name, continent
FROM world x
WHERE NOT EXISTS (
    SELECT 1
    FROM world y
    WHERE x.continent = y.continent
      AND x.name <> y.name
      AND x.population <= 3 * y.population
)
  • GROUP BY you NEED to include all non-aggregate SELECT columns in GROUP BY
SELECT game.id, game.mdate, COUNT(goal.matchid) AS goals_scored
FROM game
JOIN goal ON goal.matchid = game.id
WHERE game.team1 = 'POL' OR game.team2 = 'POL'
GROUP BY game.id, game.mdate
  • Those self joins … what a nightmare

thoughtspot tutorial

Basic SQL

✔ 15 lessons

Intermediate SQL

20 lessons

SQL Analytics Training

8 lessons

Advanced SQL

9 lessons

resources

Learn

Documentation

Practice