Homework #1 - SQL

对 Lecture #02: Advanced SQL 的一个简单测试,使用 SQLite 练习 SQL 语句。

Ubuntu 安装 SQLite:sudo apt-get install sqlite3

根据 Homework 界面下载数据集并进行配置,需要用到的 relations 结构如下:

relations
  • Q1

一个例子,没有分数,直接给了答案

但恕我没有看懂这 Details 和 Answer 怎么完全对不上

不影响,直接交就完事了

1
2
3
4
SELECT DISTINCT(language)
FROM akas
ORDER BY language
LIMIT 10;
  • Q2
1
2
3
4
5
SELECT primary_title, premiered, runtime_minutes || ' (mins)'
FROM titles
WHERE genres LIKE '%Sci-Fi%'
ORDER BY runtime_minutes DESC
LIMIT 10;

话说既然 Gradescope 评测结果里面已经有答案了能不能直接搬过来

  • Q3
1
2
3
4
5
SELECT name, 2022 - born AS age
FROM people
WHERE born >= 1900 AND died IS NULL
ORDER BY age DESC, name
LIMIT 20;

官方答案似乎连 died 也要算进去(不过不影响)

  • Q4

开始上强度了?

其实是因为没兴趣没怎么学 SQL 遭报应了

我更感兴趣的还是 DBMS 内部实现(

最后只能凭借着之前上的蜜汁数据库课(不如叫 Excel 课)以及搜索引擎硬写

1
2
3
4
5
SELECT name, COUNT(*) as num_appearances
FROM crew JOIN people ON crew.person_id = people.person_id
GROUP BY crew.person_id
ORDER BY num_appearances DESC
LIMIT 20;
  • Q5
1
2
3
4
5
6
7
8
9
10
SELECT
CAST(titles.premiered / 10 * 10 AS CHAR) || 's' AS decade,
ROUND(AVG(ratings.rating), 2) AS avg_rating,
ROUND(MAX(ratings.rating), 2),
ROUND(MIN(ratings.rating), 2),
COUNT(*)
FROM ratings INNER JOIN titles USING(title_id)
WHERE titles.premiered IS NOT NULL
GROUP BY decade
ORDER BY avg_rating DESC, decade;
  • Q6

嵌套!都可以嵌!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SELECT titles.primary_title,
ratings.votes
FROM titles
INNER JOIN ratings USING(title_id)
WHERE title_id IN (
SELECT DISTINCT(title_id)
FROM crew
WHERE person_id IN (
SELECT person_id
FROM people
WHERE name LIKE '%Cruise%'
and born = 1962
)
)
ORDER BY ratings.votes DESC
LIMIT 10;
  • Q7

如果先查出来是 2021 年可以简化代码吗,或者直接交答案

1
2
3
4
5
6
7
SELECT COUNT(DISTINCT title_id)
FROM titles
WHERE premiered IN (
SELECT premiered
FROM titles
WHERE primary_title = 'Army of Thieves'
);
  • Q8

感觉到后面都是一个套路了

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
SELECT DISTINCT(people.name)
FROM crew
INNER JOIN people USING(person_id)
WHERE crew.title_id IN (
SELECT title_id
FROM crew
WHERE person_id IN (
SELECT person_id
FROM people
WHERE name = 'Nicole Kidman'
and born = 1967
)
)
AND (
crew.category = 'actor'
OR crew.category = 'actress'
)
ORDER BY people.name;
  • Q9

窗口函数还是看不太懂

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
WITH actor_avg_rating AS (
SELECT name,
person_id,
ROUND(AVG(rating), 2) as avg_rating
FROM people
INNER JOIN crew USING(person_id)
INNER JOIN titles USING(title_id)
INNER JOIN ratings USING(title_id)
WHERE born = 1955
AND titles.type = 'movie'
GROUP BY person_id
)
SELECT name,
avg_rating
FROM (
SELECT *,
NTILE(10) OVER(
ORDER BY avg_rating
) AS quartile
FROM actor_avg_rating
)
WHERE quartile = 9
ORDER BY avg_rating DESC,
name
LIMIT 10;
  • Q10

SQL 递归看得头疼,直接看答案了:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
with p as (
select titles.primary_title as name,
akas.title as dubbed
from titles
inner join akas on titles.title_id = akas.title_id
where titles.primary_title = "House of the Dragon"
AND titles.type = 'tvSeries'
group by titles.primary_title,
akas.title
order by akas.title
),
c as (
select row_number() over (
order by p.name asc
) as seqnum,
p.dubbed as dubbed
from p
),
flattened as (
select seqnum,
dubbed
from c
where seqnum = 1
union all
select c.seqnum,
f.dubbed || ', ' || c.dubbed
from c
join flattened f on c.seqnum = f.seqnum + 1
)
select dubbed
from flattened
order by seqnum desc
limit 1;

话说回来看见答案压缩包里面有个 bonus,但是在网站上没找到?

开始学原理了,SQL 太难用了(