SQL实现用户之间关注问题
用户关注问题
场景:现有用户表 follows,两个字段 uid、to_uid,uid 表示当前用户,to_uid 表示关注的用户。
找出相互关注的用户
SELECT t1.uid,
t1.to_uid
FROM follows t1
JOIN follows t2
ON t1.uid = t2.to_uid AND t1.to_uid = t2.uid;
找出单向关注的用户
SELECT t1.uid,
t1.to_uid
FROM follows t1
JOIN follows t2
ON t1.uid = t2.to_uid AND t1.to_uid != t2.uid
用户关注表优化
当用户关注表 follows 非常大时,查询某个用户是否关注某个用户场景,查询速度回非常慢,解决办法是将用户合并。
问题重现
用户关注表原始数据表(user_follows): uid 关注 to_uid
id | uid | to_uid |
---|---|---|
0 | a | b |
1 | a | c |
2 | a | d |
3 | b | a |
4 | b | e |
5 | f | g |
- 问题:如果关注数据太大,会导致 user_follows 表非常大,每次查询会加载更多的数据
解决办法
新建一个表 user_follows_list,根据 type 来区分用户类型
SELECT uid1 uid
, type
, CAST(COLLECT_SET(uid2) AS string) user_list
, '${dt}' dt
FROM (
SELECT uid1
, uid2
, CASE
WHEN follow > 0 AND followed = 0 THEN 'follow'
WHEN follow = 0 AND followed > 0 THEN 'followed'
WHEN follow > 0 AND followed > 0 THEN 'both'
END type
FROM (
SELECT uid1
, uid2
, SUM(follow) follow
, SUM(followed) followed
FROM (
SELECT uid uid1
, follow_uid uid2
, 1 follow
, 0 followed
FROM ods_poke_user_follows
WHERE dt = TO_DATE(DATE_SUB(NOW(), 1))
UNION ALL
SELECT follow_uid uid1
, uid uid2
, 0 follow
, 1 followed
FROM user_follows
WHERE dt = TO_DATE(DATE_SUB(NOW(), 1))
) t1
GROUP BY uid1, uid2
) t1
) t1
GROUP BY uid1, type
实际效果
原始扩展
uid1 uid2 follow followed
a b 1 0
a c 1 0
a d 1 0
b a 1 0
b e 1 0
f g 1 0
b a 0 1
c a 0 1
d a 0 1
a b 0 1
e b 0 1
g f 0 1
聚合
a b both
a c follow
a d follow
b a both
b e follow
f g follow
c a followed
d a followed
e b followed
g f followed
结果
uid type user_list
a both [b]
a follow [c,d]
b both [a]
b follow [e]
f follow [g]
c followed [a]
d followed [a]
e followed [b]
g followed [f]
使用示例
-- uid 关注 user_list
select * from user_follows_list where type in ('follow', 'both')
-- user_list 关注 uid
select * from user_follows_list where type in ('followed', 'both')
-- user_list 和 uid 互关
select * from user_follows_list where type in ('both')