Author Topic: Huge query in pluto_media : taking my disk down to crawling  (Read 14826 times)

jamo

  • Guru
  • ****
  • Posts: 463
    • View Profile
    • LMCE user Page
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #15 on: September 21, 2013, 09:41:37 pm »
explain on my side
Code: [Select]
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
| id | select_type | table          | type   | possible_keys      | key     | key_len | ref                                     | rows | Extra       |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
|  1 | PRIMARY     | <derived2>     | ALL    | NULL               | NULL    | NULL    | NULL                                    | 3017 |             |
|  1 | PRIMARY     | File           | eq_ref | PRIMARY,FatGroupBy | PRIMARY | 4       | s.PK_File                               |    1 |             |
|  2 | DERIVED     | File           | index  | Path,Filename      | PRIMARY | 4       | NULL                                    | 3231 | Using where |
|  2 | DERIVED     | Bookmark       | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | File_Attribute | ref    | PRIMARY,FK_File    | FK_File | 4       | pluto_media.File.PK_File                |    2 | Using index |
|  2 | DERIVED     | Attribute      | eq_ref | PRIMARY            | PRIMARY | 4       | pluto_media.File_Attribute.FK_Attribute |    1 |             |
|  2 | DERIVED     | LongAttribute  | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | Picture_File   | ref    | FK_File            | FK_File | 4       | pluto_media.File.PK_File                |    1 |             |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
8 rows in set (0.10 sec)

bushtech

  • Veteran
  • ***
  • Posts: 98
    • View Profile
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #16 on: September 21, 2013, 10:41:39 pm »
Ran it on my core. New 1004 installation so fairly clean. 60 odd dvd's on media drive currently. Using a ssd as OS driveRan it on my core. New 1004 installation so fairly clean. 60 odd dvd's on media drive currently. Using a ssd as OS drive


Code: [Select]

| id | select_type | table          | type   | possible_keys      | key     | key_len | ref                                     | rows | Extra       |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
|  1 | PRIMARY     | <derived2>     | ALL    | NULL               | NULL    | NULL    | NULL                                    |  542 |             |
|  1 | PRIMARY     | File           | eq_ref | PRIMARY,FatGroupBy | PRIMARY | 4       | s.PK_File                               |    1 |             |
|  2 | DERIVED     | File           | index  | Path,Filename      | PRIMARY | 4       | NULL                                    |  162 | Using where |
|  2 | DERIVED     | Bookmark       | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | File_Attribute | ref    | PRIMARY,FK_File    | FK_File | 4       | pluto_media.File.PK_File                |    3 | Using index |
|  2 | DERIVED     | Attribute      | eq_ref | PRIMARY            | PRIMARY | 4       | pluto_media.File_Attribute.FK_Attribute |    1 |             |
|  2 | DERIVED     | LongAttribute  | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | Picture_File   | ref    | FK_File            | FK_File | 4       | pluto_media.File.PK_File                |    1 |             |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
8 rows in set (0.02 sec)



Jamo asked me to post this link to his explain code:  http://pastebin.com/jnQsJGAc











Core:Asus P8H77-V, Intel core i3-3240 LGA155, Inno 3D Nvidia 240 GT (DVI+VGA+HDMI), 4Gb ram, 125Gb SSD
MD: Zotax Zbox ID41

bulek

  • Administrator
  • wants to work for LinuxMCE
  • *****
  • Posts: 909
  • Living with LMCE
    • View Profile
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #17 on: September 21, 2013, 11:15:05 pm »
Hi,

here are the results for my database:

Quote
mysql> select count(*) from Picture_File;
+----------+
| count(*) |
+----------+
|     3999 |
+----------+
1 row in set (0.04 sec)

mysql> select count(*) from File;
+----------+
| count(*) |
+----------+
|  2422647 |
+----------+
1 row in set (1.37 sec)

mysql> select count(*) from File_Attribute;
+----------+
| count(*) |
+----------+
|    41181 |
+----------+
1 row in set (0.34 sec)

mysql> select count(*) from LongAttribute;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.02 sec)


Regards,
Bulek.
Thanks in advance,

regards,

Bulek.

tschak909

  • LinuxMCE God
  • ****
  • Posts: 5549
  • DOES work for LinuxMCE.
    • View Profile
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #18 on: September 22, 2013, 01:36:17 am »
After letting updatemedia crawl my DB from a clean wipe of the pluto_media db, The resulting query time still is approximately 10 seconds.

I have put my pluto_media.sql here: https://96.226.245.101/work/pluto_media.sql ... if anyone wants to look.

-Thom

golgoj4

  • NEEDS to work for LinuxMCE
  • ***
  • Posts: 1193
  • hrumpf!
    • View Profile
    • Mah Website
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #19 on: September 22, 2013, 06:57:53 am »

SELECT COUNT( * ) FROM FILE - 28847  Query took 0.0001 sec

SELECT COUNT( * )  FROM Picture_File - 6358 - is instant a time


select count(*) from File_Attribute; -  105293   - no seriously, is it?

1004 running samsung SSD




Linuxmce - Where everyone is never wrong, but we are always behind xbmc in the media / ui department.

tschak909

  • LinuxMCE God
  • ****
  • Posts: 5549
  • DOES work for LinuxMCE.
    • View Profile
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #20 on: September 22, 2013, 08:37:09 pm »
I think the purpose of posde asking for these numbers is to find where the bottlenecks would be in the large query. Asking for a linear return of a table, without any joins, will _always_ be fast.

-Thom

buckle

  • Veteran
  • ***
  • Posts: 68
    • View Profile
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #21 on: September 25, 2013, 04:41:40 am »
Making sure all the join conditions are indexed should help.

Code: [Select]
CREATE INDEX IX_Bookmark_FK_File
    ON Bookmark (FK_File);

CREATE INDEX IX_LongAttribute_FK_File
    ON LongAttribute (FK_File);

Using Thom's database on my machine I went from 20 seconds to < 3 seconds.


Typically I would check to make sure I had properly indexed anything in a where clause as well.

Adding one for Missing should also improve the query performance, although the difference for me was negligible.

Code: [Select]
CREATE INDEX IX_File_Missing
ON File(Missing);

FileName and Path indexes did not yield anything for me.  Here they are anyway...
Code: [Select]
CREATE INDEX ix_File_Path
ON File;

CREATE INDEX ix_File_FileName
ON File;

Hope this helps.

-buckle

Marie.O

  • Administrator
  • LinuxMCE God
  • *****
  • Posts: 3675
  • Wastes Life On LinuxMCE Since 2007
    • View Profile
    • My Home
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #22 on: September 25, 2013, 04:05:35 pm »
Missing should have already been implemented.

phenigma

  • LinuxMCE God
  • ****
  • Posts: 1758
    • View Profile
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #23 on: September 27, 2013, 01:10:56 am »
Here's a version that I'm testing now.  It fixes the CurrentDbAttrCount and HasAttributes values that were not reporting correctly in jamo's post.  I am no one as far as mysql goes but I've been bashing on this for a couple hours.  

The missing index was not implemented on my db.

Thanks jamo, buckle, bulek, tschak, and others for working on this query!  This has been a great collaboration!

Code: [Select]
SELECT
  File.PK_File,
  File.Path,
  File.Filename,
  File.Inode,  
  s.CurrentDbAttrDate,
  s.CurrentDbAttrCount,
  s.HasAttributes,  
  File.AttrDate AS OldDbAttrDate,
  File.AttrCount AS OldDbAttrCount,    
  File.ModificationDate AS OldFileDate,
  File.Source

FROM
(
  SELECT
    PK_File,
    greatest(
      (IF(Bookmark.psc_mod IS NULL, CAST('0000-00-00 00:00:00' AS DATE),Bookmark.psc_mod)),
      (IF(Attribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Attribute.psc_mod)),
      (IF(LongAttribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), LongAttribute.psc_mod)),
      (IF(Picture_File.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Picture_File.psc_mod)) ) As CurrentDbAttrDate,
    (select count(*) from Bookmark where FK_File=PK_File) as BOOKMARKCOUNT,
    (select count(*) from File_Attribute where FK_File=PK_File) as ACOUNT,    
    (select count(*) from LongAttribute where FK_File=PK_File) as LONGATTRCOUNT,
    (select count(*) from Picture_File where FK_File=PK_File) as IMGCOUNT,
    ((1000000 * (select BOOKMARKCOUNT)) +
      (10000 * (select ACOUNT)) +    
      (100 * (select LONGATTRCOUNT)) +
      (select IMGCOUNT)) as CurrentDbAttrCount,
    ((select ACOUNT) + (select LONGATTRCOUNT)) as HasAttributes

  FROM File
    LEFT JOIN Bookmark ON Bookmark.FK_File = PK_File
    LEFT JOIN File_Attribute ON File_Attribute.FK_File = PK_File
    LEFT JOIN Attribute ON File_Attribute.FK_Attribute = PK_Attribute
    LEFT JOIN LongAttribute ON LongAttribute.FK_File = PK_File
    LEFT JOIN Picture_File ON Picture_File.FK_File = PK_File

  WHERE 1 = 1
    AND ( Path LIKE '/home/public/data%' OR (Path ='/home/public' AND Filename = 'data' )  
      OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' )  
      OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' )  
      OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' )  
      OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )
      OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' )  
      OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' )  
      OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' )  
      OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' ) )  
    AND Missing = 0

    GROUP BY PK_File
) s

INNER JOIN File ON s.PK_File=File.PK_File;

J.

tschak909

  • LinuxMCE God
  • ****
  • Posts: 5549
  • DOES work for LinuxMCE.
    • View Profile
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #24 on: September 27, 2013, 05:56:34 am »
Can we get at least one more successful test of this, before I plop this into UpdateMedia/MediaState.cpp ?

-Thom

buckle

  • Veteran
  • ***
  • Posts: 68
    • View Profile
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #25 on: September 27, 2013, 10:35:04 pm »
I have a new query that also addressed the attribute counts issue with Jamo's version. 

I think it is about as simplified as I can make it.  I believe it was over-complicated from the beginning, at least for what the query set out to achieve.

Code: [Select]
SELECT
F.PK_File
,F.Path
,F.Filename
,F.Inode
,greatest(
IFNULL(B.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
IFNULL(A.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
IFNULL(LA.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
IFNULL(PF.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
CAST('0000-00-00 00:00:00' AS DATE)
) As CurrentDbAttrDate
,(1000000 * IFNULL(B.Bookmark_Count, 0)
+ 10000 * IFNULL(A.Attribute_Count, 0)
+ 100 * IFNULL(LA.Long_Attribute_Count, 0)
+ IFNULL(PF.Picture_Count, 0)
) AS CurrentDbAttrCount
,IFNULL(A.Attribute_Count,0)+IFNULL(LA.Long_Attribute_Count,0) AS HasAttributes
,F.AttrDate AS OldDbAttrDate
,F.AttrCount AS OldDbAttrCount
,F.ModificationDate AS OldFileDate
,F.Source
FROM
File F
LEFT JOIN (select MAX(psc_mod) as max_psc_mod,  FK_File as FK_File, count(*) as Bookmark_Count from Bookmark group by FK_File) B ON B.FK_File = F.PK_File
LEFT JOIN (select MAX(Attrib.psc_mod) as max_psc_mod, count(*) as Attribute_Count, FileAttrib.FK_File as FK_File from File_Attribute FileAttrib left join Attribute Attrib ON FileAttrib.FK_Attribute = Attrib.PK_Attribute group by FK_File) A on A.FK_File = F.PK_File
LEFT JOIN (select MAX(psc_mod) as max_psc_mod, FK_File as FK_File, count(*) as Long_Attribute_Count from LongAttribute group by FK_File) LA ON LA.FK_File = F.PK_File
LEFT JOIN (select MAX(psc_mod) as max_psc_mod, FK_File as FK_File, count(*) as Picture_Count from Picture_File group by FK_File) PF ON PF.FK_File = F.PK_File
WHERE 1 = 1
AND ( Path LIKE '/home/public/data%' OR (Path ='/home/public' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )
      OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' ) ) 
AND Missing = 0

I used phenigma's where clause for the sake of being thorough, and because it looked nice. :)

It compares exactly 100% against the original query in my test environment (from tschak's script). 

It performs pretty well, too.  Let me know how it works for you.

-buckle

phenigma

  • LinuxMCE God
  • ****
  • Posts: 1758
    • View Profile
Re:
« Reply #26 on: September 28, 2013, 02:09:56 pm »
Well...  As we found last night.  Buckle's query takes over 20 minutes to run on my 1004.  Jamo's with the additional corrections I made last night runs in 6 seconds.  The original takes over 25 seconds.  

http://pastebin.com/9XxjscCf <- jamo's with corrections

Sent from my SGH-I927 using Tapatalk 4
« Last Edit: September 28, 2013, 05:48:38 pm by phenigma »

phenigma

  • LinuxMCE God
  • ****
  • Posts: 1758
    • View Profile
Re:
« Reply #27 on: September 30, 2013, 05:38:41 am »
I'm hoping we can get some more people to test the various versions and report back about the timings.

Sent from my SGH-I927 using Tapatalk 4

bushtech

  • Veteran
  • ***
  • Posts: 98
    • View Profile
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #28 on: September 30, 2013, 07:21:37 am »
phenigma:

Ran jamo's query from your last post:

But I get this error:

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that       corresponds to your MySQL server version for the right syntax to use near 'ON PK      _Attribute=FK_Attribute WHERE FK_File=PK_File),
(select (IFNULL(MAX(LongAtt' at line 22
Core:Asus P8H77-V, Intel core i3-3240 LGA155, Inno 3D Nvidia 240 GT (DVI+VGA+HDMI), 4Gb ram, 125Gb SSD
MD: Zotax Zbox ID41

jamo

  • Guru
  • ****
  • Posts: 463
    • View Profile
    • LMCE user Page
Re: Huge query in pluto_media : taking my disk down to crawling
« Reply #29 on: September 30, 2013, 08:19:59 am »
Bushtek...

there should be no space betweekn "PK" and "_Attribute". I think copying and pasting has introduced an error. It should read "ON PK_Attribute=FK_Attribute" or something like that.

phenigma:

Ran jamo's query from your last post:

But I get this error:

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that       corresponds to your MySQL server version for the right syntax to use near 'ON PK      _Attribute=FK_Attribute WHERE FK_File=PK_File),
(select (IFNULL(MAX(LongAtt' at line 22