2019年12月12日 09:57 by wst
database昨天有产品给我反映,grafana中数据报表有错误,希望能给改改。
具体问题:1. 数据有重复; 2.字段显示错误
1. 打开报表的编辑功能,把sql粘出来。刚看吓我一跳,竟然有49行,连接了11个表。具体如下:
SELECT
tb_data.date_time AS 日期,
tb_data.ader_id AS 广告主ID,
tb_account.account_name AS 账户名,
tb_account.company_name AS 广告主名,
tb_industry1.industry_name AS 一级行业,
tb_industry2.industry_name AS 二级行业,
tb_group.account_name AS 管理员,
tb_data.order_id AS 广告组ID,
tb_order.order_name AS 广告组名称,
tb_data.ad_id AS 计划ID,
tb_ad.ad_name AS 计划名称,
tb_data.position_id AS 广告位ID,
tb_position.position_name AS 广告位名称,
tb_place_attr.ad_place_attr_name AS 广告位属性,
tb_media.media_name AS 媒体,
tb_position.channel_name AS 渠道,
tb_port_type.terminal_name AS 投放终端,
tb_data.display_pv AS 曝光PV,
tb_data.click_pv AS 点击PV,
tb_data.cost AS 花费,
tb_data.ctr AS 点击转化率,
tb_data.cpc AS 单次点击均价,
tb_data.cpm AS 千次曝光均价,
tb_data.sure_dl_pv AS 确定下载PV,
tb_data.cancel_dl_pv AS 取消下载PV,
tb_data.start_dl_pv AS 开始下载PV ,
tb_data.complete_dl_pv AS 完成下载PV,
tb_data.start_install_pv AS 开始安装PV,
tb_data.complete_install_pv AS 完成安装PV
FROM cache_creative_position_statistic AS tb_data
LEFT JOIN (SELECT DISTINCT account_id,account_name,company_name,industry1_id,industry2_id,group_id FROM nyg_account WHERE account_type in (1,2,3)) AS tb_account ON tb_data.ader_id=tb_account.account_id
LEFT JOIN nyg_order AS tb_order ON tb_data.order_id=tb_order.order_id
LEFT JOIN nyg_ad AS tb_ad ON tb_data.ad_id=tb_ad.ad_id
LEFT JOIN nyg_ad_position AS tb_position ON tb_data.position_id=tb_position.position_id
left join nyg_media AS tb_media on tb_position.media_id=tb_media.media_id
left join nyg_ad_place_attr AS tb_place_attr on tb_position.place_attr_id=tb_place_attr.ad_place_attr_id
left join nyg_terminal AS tb_port_type on tb_position.port_type=tb_port_type.terminal_id
LEFT JOIN nyg_account AS tb_group ON tb_account.group_id=tb_group.account_id
LEFT JOIN nyg_industry AS tb_industry1 ON tb_account.industry1_id=tb_industry1.industry_id
LEFT JOIN nyg_industry AS tb_industry2 ON tb_account.industry2_id=tb_industry2.industry_id
WHERE tb_data.order_id!='all'
AND tb_data.ad_id!='all'
AND tb_data.creative_id='all'
AND if(trim('')='',1=1,tb_data.ader_id='')
AND if(trim('')='',1=1,tb_account.account_name like '%%')
AND if(trim('')='',1=1,tb_data.order_id='')
AND if(trim('')='',1=1,tb_data.ad_id='')
AND tb_data.date_time BETWEEN FROM_UNIXTIME(1575886702) AND FROM_UNIXTIME(1576059502)
2. 这么多表,根本不知道错误在哪里。先分析内容逻辑关系:主表为cache_creative_position_statistic,其他都是辅助表。
3. 一步步排查,先把主表内容查出来,然后逐步的连接辅助表(同时调整表的连接顺序),看连接哪一个表的时候数据出现问题?然后就从分析那个表开始。
排查过程中的sql语句为:
语句1(先连接广告位表,然后注释掉部分字段):
SELECT
tb_data.date_time AS 日期,
tb_data.ader_id AS 广告主ID,
-- tb_account.account_name AS 账户名,
-- tb_account.company_name AS 广告主名,
-- tb_industry1.industry_name AS 一级行业,
-- tb_industry2.industry_name AS 二级行业,
-- tb_group.account_name AS 管理员,
tb_data.order_id AS 广告组ID,
tb_order.order_name AS 广告组名称,
tb_data.ad_id AS 计划ID,
-- tb_ad.ad_name AS 计划名称,
tb_data.position_id AS 广告位ID,
tb_position.position_name AS 广告位名称,
-- tb_place_attr.ad_place_attr_name AS 广告位属性,
-- tb_media.media_name AS 媒体,
tb_position.channel_name AS 渠道,
-- tb_port_type.terminal_name AS 投放终端,
tb_data.display_pv AS 曝光PV,
tb_data.click_pv AS 点击PV,
tb_data.cost AS 花费,
tb_data.ctr AS 点击转化率,
tb_data.cpc AS 单次点击均价,
tb_data.cpm AS 千次曝光均价,
tb_data.sure_dl_pv AS 确定下载PV,
tb_data.cancel_dl_pv AS 取消下载PV,
tb_data.start_dl_pv AS 开始下载PV ,
tb_data.complete_dl_pv AS 完成下载PV,
tb_data.start_install_pv AS 开始安装PV,
tb_data.complete_install_pv AS 完成安装PV
FROM cache_creative_position_statistic AS tb_data
LEFT JOIN nyg_ad_position AS tb_position ON tb_data.position_id=tb_position.position_id
-- LEFT JOIN (SELECT DISTINCT account_id,account_name,company_name,industry1_id,industry2_id,group_id
-- FROM nyg_account WHERE account_type in (1,2,3)) AS tb_account ON tb_data.ader_id=tb_account.account_id
LEFT JOIN nyg_order AS tb_order ON tb_data.order_id=tb_order.order_id
-- LEFT JOIN nyg_ad AS tb_ad ON tb_data.ad_id=tb_ad.ad_id
-- left join nyg_media AS tb_media on tb_position.media_id=tb_media.media_id
-- left join nyg_ad_place_attr AS tb_place_attr on tb_position.place_attr_id=tb_place_attr.ad_place_attr_id
-- left join nyg_terminal AS tb_port_type on tb_position.port_type=tb_port_type.terminal_id
-- LEFT JOIN nyg_account AS tb_group ON tb_account.group_id=tb_group.account_id
-- LEFT JOIN nyg_industry AS tb_industry1 ON tb_account.industry1_id=tb_industry1.industry_id
-- LEFT JOIN nyg_industry AS tb_industry2 ON tb_account.industry2_id=tb_industry2.industry_id
WHERE tb_data.order_id!='all'
AND tb_data.ad_id!='all'
AND tb_data.creative_id='all'
-- AND if(trim('')='',1=1,tb_data.ader_id='')
-- AND if(trim('')='',1=1,tb_account.account_name like '%%')
-- AND if(trim('')='',1=1,tb_data.order_id='')
-- AND if(trim('')='',1=1,tb_data.ad_id='')
AND tb_data.date_time BETWEEN FROM_UNIXTIME(1575879511) AND FROM_UNIXTIME(1576052311)
语句2(逐步的添加字段,发现tb_ad表数据有重复,改为使用distinct后的结果):
SELECT
tb_data.date_time AS 日期,
tb_data.ader_id AS 广告主ID,
tb_account.account_name AS 账户名,
tb_account.company_name AS 广告主名,
-- tb_industry1.industry_name AS 一级行业,
-- tb_industry2.industry_name AS 二级行业,
tb_group.account_name AS 管理员,
tb_data.order_id AS 广告组ID,
tb_order.order_name AS 广告组名称,
tb_data.ad_id AS 计划ID,
tb_ad.ad_name AS 计划名称,
tb_data.position_id AS 广告位ID,
tb_position.position_name AS 广告位名称,
tb_place_attr.ad_place_attr_name AS 广告位属性,
tb_media.media_name AS 媒体,
tb_position.channel_name AS 渠道,
tb_port_type.terminal_name AS 投放终端,
tb_data.display_pv AS 曝光PV,
tb_data.click_pv AS 点击PV,
tb_data.cost AS 花费,
tb_data.ctr AS 点击转化率,
tb_data.cpc AS 单次点击均价,
tb_data.cpm AS 千次曝光均价,
tb_data.sure_dl_pv AS 确定下载PV,
tb_data.cancel_dl_pv AS 取消下载PV,
tb_data.start_dl_pv AS 开始下载PV ,
tb_data.complete_dl_pv AS 完成下载PV,
tb_data.start_install_pv AS 开始安装PV,
tb_data.complete_install_pv AS 完成安装PV
FROM cache_creative_position_statistic AS tb_data
LEFT JOIN nyg_ad_position AS tb_position ON tb_data.position_id=tb_position.position_id
LEFT JOIN (SELECT DISTINCT account_id,account_name,company_name,industry1_id,industry2_id,group_id
FROM nyg_account WHERE account_type in (1,2,3)) AS tb_account ON tb_data.ader_id=tb_account.account_id
LEFT JOIN nyg_order AS tb_order ON tb_data.order_id=tb_order.order_id
LEFT JOIN (select distinct ad_id,ad_name from nyg_ad) AS tb_ad ON tb_data.ad_id=tb_ad.ad_id
left join nyg_media AS tb_media on tb_position.media_id=tb_media.media_id
left join nyg_ad_place_attr AS tb_place_attr on tb_position.place_attr_id=tb_place_attr.ad_place_attr_id
left join nyg_terminal AS tb_port_type on tb_position.port_type=tb_port_type.terminal_id
LEFT JOIN nyg_account AS tb_group ON tb_account.group_id=tb_group.account_id
-- LEFT JOIN nyg_industry AS tb_industry1 ON tb_account.industry1_id=tb_industry1.industry_id
-- LEFT JOIN nyg_industry AS tb_industry2 ON tb_account.industry2_id=tb_industry2.industry_id
WHERE tb_data.order_id!='all'
AND tb_data.ad_id!='all'
AND tb_data.creative_id='all'
-- AND if(trim('')='',1=1,tb_data.ader_id='')
-- AND if(trim('')='',1=1,tb_account.account_name like '%%')
-- AND if(trim('')='',1=1,tb_data.order_id='')
-- AND if(trim('')='',1=1,tb_data.ad_id='')
AND tb_data.date_time BETWEEN FROM_UNIXTIME(1575879511) AND FROM_UNIXTIME(1576052311)
语句3(全部调整完毕,结果能正确显示):
SELECT
tb_data.date_time AS 日期,
tb_data.ader_id AS 广告主ID,
tb_account.account_name AS 账户名,
tb_account.company_name AS 广告主名,
tb_industry1.industry_name AS 一级行业,
tb_industry2.industry_name AS 二级行业,
tb_group.account_name AS 管理员,
tb_data.order_id AS 广告组ID,
tb_order.order_name AS 广告组名称,
tb_data.ad_id AS 计划ID,
tb_ad.ad_name AS 计划名称,
tb_data.position_id AS 广告位ID,
tb_position.position_name AS 广告位名称,
tb_place_attr.ad_place_attr_name AS 广告位属性,
tb_media.media_name AS 媒体,
tb_position.channel_name AS 渠道,
tb_port_type.terminal_name AS 投放终端,
tb_data.display_pv AS 曝光PV,
tb_data.click_pv AS 点击PV,
tb_data.cost AS 花费,
tb_data.ctr AS 点击转化率,
tb_data.cpc AS 单次点击均价,
tb_data.cpm AS 千次曝光均价,
tb_data.sure_dl_pv AS 确定下载PV,
tb_data.cancel_dl_pv AS 取消下载PV,
tb_data.start_dl_pv AS 开始下载PV ,
tb_data.complete_dl_pv AS 完成下载PV,
tb_data.start_install_pv AS 开始安装PV,
tb_data.complete_install_pv AS 完成安装PV
FROM cache_creative_position_statistic AS tb_data
LEFT JOIN nyg_ad_position AS tb_position ON tb_data.position_id=tb_position.position_id
LEFT JOIN (SELECT DISTINCT account_id,account_name,company_name,industry1_id,industry2_id,group_id
FROM nyg_account WHERE account_type in (1,2,3)) AS tb_account ON tb_data.ader_id=tb_account.account_id
LEFT JOIN nyg_order AS tb_order ON tb_data.order_id=tb_order.order_id
LEFT JOIN (select distinct ad_id,ad_name from nyg_ad) AS tb_ad ON tb_data.ad_id=tb_ad.ad_id
left join nyg_media AS tb_media on tb_position.media_id=tb_media.media_id
left join nyg_ad_place_attr AS tb_place_attr on tb_position.place_attr_id=tb_place_attr.ad_place_attr_id
left join nyg_terminal AS tb_port_type on tb_position.port_type=tb_port_type.terminal_id
LEFT JOIN nyg_account AS tb_group ON tb_account.group_id=tb_group.account_id
LEFT JOIN nyg_industry AS tb_industry1 ON tb_account.industry1_id=tb_industry1.industry_id
LEFT JOIN nyg_industry AS tb_industry2 ON tb_account.industry2_id=tb_industry2.industry_id
WHERE tb_data.order_id!='all'
AND tb_data.ad_id!='all'
AND tb_data.creative_id='all'
AND if(trim('')='',1=1,tb_data.ader_id='')
AND if(trim('')='',1=1,tb_account.account_name like '%%')
AND if(trim('')='',1=1,tb_data.order_id='')
AND if(trim('')='',1=1,tb_data.ad_id='')
AND tb_data.date_time BETWEEN FROM_UNIXTIME(1575879511) AND FROM_UNIXTIME(1576052311)
在sql语句排查的过程中一般采用的方法为:
1. 逐步添加辅助表及其字段
2. 观察连接哪个表时出错,调整相关表;
3. 根据逻辑关系,调整表的连接顺序;