![]() |
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]()
![]() |

|
| Knowledge Base Most common error and how to trouble shoot them off |
![]() |
|
SQL - eliminate rows based on duplicate non-key info
|
LinkBack | Thread Tools | Display Modes |
|
|
#1 (permalink) |
|
Administrator
Posts: 875
Join Date: Oct 2005
Rep Power: 10
IM:
|
SELECT a.emplid ,a.test_id ,a.ls_data_source FROM ps_stdnt_test_comp a , ps_sa_tcmp_rel_tbl b WHERE a.test_id IN ('GRE','GMAT','TOEFL','CTOFL','TOEFL IBT','PRAX1','PRXC1') AND a.test_id = b.test_id AND a.test_component = b.test_component Result Test scenario looks like this: Emplid TestId Source EMPLID01 GRE SELF EMPLID01 GRE OFF EMPLID01 TOEFL SELF EMPLID01 TOEFL OFF I want to eliminate rows where the empl+testid occurs more than once, and any of the multiple rows has a source of OFF. Basically I want to show self-reported test scores until the official test score comes in, then only show the official row. Solution: I would think the subquery would have a much better chance to benefit from proper indexing. On the bad side, the subquery will in fact require an index of some sort to get any type of acceptable performance on a large table. But for the join, when you add this: AND aSelf.ls_data_source <> 'OFF' to the main WHERE clause, I think you reduce the chance of an index being used, I think you reduce the chance of an index being selected. It will be interesting to see actual results :-) Although it works with the limited data pulls I cannot TNC's to work when I add the additional fields to pull, I just get all the rows.; probably a quick fix. Here is what I am running (note the 'MAN' instead of 'OFF' - 'MAN is the actual database value). Again; it works until I toss in the fields i have commented out. Scott's works fine and is very quick. SELECT DISTINCT aSelf.emplid, aSelf.test_id, aSelf.test_component, COALESCE(aOff.ls_data_source, aSelf.ls_data_source) AS ls_data_source -- ,aself.score --,aself.score_letter -- ,aself.ext_acad_level -- ,aself.date_loaded -- ,aself.percentile -- ,aself.test_admin -- ,aself.test_index FROM ps_stdnt_test_comp aSelf LEFT JOIN ps_stdnt_test_comp aOff ON aSelf.emplid = aOff.emplid AND aSelf.test_id = aOff.test_id AND aSelf.test_component = aOff.test_component AND -- this is the new condition I added . . . add as many of these as is required to build the NK aOff.ls_data_source = 'MAN' where aself.emplid = '10947' SELECT a.emplid ,a.test_id ,a.test_dt ,a.ls_data_source ,b.descrshort ,a.score ,a.score_letter ,a.ext_acad_level ,a.date_loaded ,a.percentile ,a.test_admin ,a.test_index FROM ps_stdnt_test_comp a , ps_sa_tcmp_rel_tbl b WHERE a.test_id IN ('GRE','GMAT','TOEFL','CTOFL','TOEFL IBT','PRAX1','PRXC1') AND a.test_id = b.test_id AND a.test_component = b.test_component AND (a.ls_data_source = 'MAN' OR (NOT EXISTS(SELECT 1 FROM ps_stdnt_test_comp c WHERE c.emplid = a.emplid AND c.test_id = a.test_id AND c.test_component = a.test_component AND c.ls_data_source = 'MAN'))) |
|
|
|
|
|
|
![]() |
| Currently Active Users Viewing This Thread: 1 (0 members and 1 guests) | |
| Thread Tools | |
| Display Modes | |
|
|