LinuxMCE Forums

General => Users => Topic started by: bulek on September 20, 2013, 09:51:05 am

Title: Huge query in pluto_media : taking my disk down to crawling
Post by: bulek on September 20, 2013, 09:51:05 am
Hi,

I've posted a thread about freqent crashes when clicking on Video button on main Orbiter screen... Crash came out of data not being available in reasonable time - and cause is that my disk is so slooow...

But I guess this is not the HW problem, but more SW. if I do this :
Code: [Select]
hdparm -tT /dev/sda
/dev/sda:
 Timing cached reads:   3548 MB in  2.00 seconds = 1771.32 MB/sec
 Timing buffered disk reads:  10 MB in  3.17 seconds =   3.15 MB/sec

When this happens, I get frequent crashes in media browsers on Orbiter, have unresponsive Orbiter (needs 30+ secs to get from screensaver to Orbiter screen), etc....

When second parameter is >60MB/sec, everything is working much faster and ok...

Now what is slowing my disk down ?
when I do
Code: [Select]
iotop -oP I see that mysqld is producing massive amount of disk writes and reads (I always thought this is pretty normal, till Radu didn't check his machines and we compared...) - approx. 5-10 M/s...

when I check what mysql is doing with :
Code: [Select]
mysql
show full processlist;

I see that there is a massive sql query going on, probably taking all that massive amount of disk accesses...

Quote
|   594 | root   | localhost | pluto_media    | Query   |  1623 | Sorting result | SELECT PK_File, Path, Filename, INode,
MAX(PSC_MOD) As CurrentDbAttrDate,
(1000000 * SUM(BOOKMARKS) + 10000 * SUM(ATTRIBUTES) + 100 * SUM(LONG_ATTRIBUTES) + SUM(PICTURES)) AS CurrentDbAttrCount,
(SUM(ATTRIBUTES) + SUM(LONG_ATTRIBUTES)) AS HasAttributes,
AttrDate AS OldDbAttrDate, AttrCount AS OldDbAttrCount, ModificationDate AS OldFileDate,
Source
FROM
(
        (
                SELECT PK_File, Path, Filename, INode, Source, Missing,
                        AttrDate, AttrCount, ModificationDate,
                        MAX(IF(Bookmark.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Bookmark.psc_mod)) AS PSC_MOD,
                        COUNT(Bookmark.PK_Bookmark) AS BOOKMARKS,
                        0 AS ATTRIBUTES, 0 AS LONG_ATTRIBUTES, 0 AS PICTURES
                FROM File
                LEFT JOIN Bookmark ON Bookmark.FK_File = PK_File
                WHERE 1 = 1 AND (Path LIKE '/home/public/data%' OR (Path = '/home/public' AND Filename = 'data' )  OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' )  OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' )  OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' )  OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )  OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' )  OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' )  OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' )  OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' )  )  AND Missing = 0
                GROUP BY PK_File, Path, Filename, INode,  AttrDate, AttrCount, ModificationDate
        )
UNION
        (
                SELECT PK_File, Path, Filename, INode, Source, Missing,
                        AttrDate, AttrCount, ModificationDate,
                        MAX(IF(Attribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Attribute.psc_mod)) AS PSC_MOD,
                        0 AS BOOKMARKS,
                        COUNT(PK_Attribute) AS ATTRIBUTES,
                        0 AS LONG_ATTRIBUTES, 0 AS PICTURES
                FROM File
                LEFT JOIN File_Attribute ON File_Attribute.FK_File = PK_File
                LEFT JOIN Attribute ON File_Attribute.FK_Attribute = PK_Attribute
                WHERE 1 = 1 AND (Path LIKE '/home/public/data%' OR (Path = '/home/public' AND Filename = 'data' )  OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' )  OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' )  OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' )  OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )  OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' )  OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' )  OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' )  OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' )  )  AND Missing = 0
                GROUP BY PK_File, Path, Filename, INode,  AttrDate, AttrCount, ModificationDate
        )
UNION
        (
                SELECT PK_File, Path, Filename, INode, Source, Missing,
                        AttrDate, AttrCount, ModificationDate,
                        MAX(IF(LongAttribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), LongAttribute.psc_mod)) AS PSC_MOD,
                        0 AS BOOKMARKS, 0 AS ATTRIBUTES,
                        COUNT(PK_LongAttribute) AS LONG_ATTRIBUTES,
                        0 AS PICTURES
                FROM File
                LEFT JOIN LongAttribute ON LongAttribute.FK_File = PK_File
                WHERE 1 = 1 AND (Path LIKE '/home/public/data%' OR (Path = '/home/public' AND Filename = 'data' )  OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' )  OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' )  OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' )  OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )  OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' )  OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' )  OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' )  OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' )  )  AND Missing = 0
                GROUP BY PK_File, Path, Filename, INode,  AttrDate, AttrCount, ModificationDate
        )
UNION
        (
                SELECT PK_File, Path, Filename, INode, Source, Missing,
                        AttrDate, AttrCount, ModificationDate,
                        MAX(IF(Picture_File.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Picture_File.psc_mod)) AS PSC_MOD,
                        0 AS BOOKMARKS, 0 AS ATTRIBUTES, 0 AS LONG_ATTRIBUTES,
                        COUNT(FK_Picture) AS PICTURES
                FROM File
                LEFT JOIN Picture_File ON Picture_File.FK_File = PK_File
                WHERE 1 = 1 AND (Path LIKE '/home/public/data%' OR (Path = '/home/public' AND Filename = 'data' )  OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' )  OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' )  OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' )  OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )  OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' )  OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' )  OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' )  OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' )  )  AND Missing = 0
                GROUP BY PK_File, Path, Filename, INode, AttrDate, AttrCount, ModificationDate
        )
) AS FILE_UNION
WHERE 1 = 1 AND (Path LIKE '/home/public/data%' OR (Path = '/home/public' AND Filename = 'data' )  OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' )  OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' )  OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' )  OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )  OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' )  OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' )  OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' )  OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' )  )  AND Missing = 0
GROUP BY PK_File, Path, Filename, INode, AttrDate, AttrCount, ModificationDate |

My first question (dillema) is whether I'm chasing a ghost here and this is pretty normal or is there something to be checked further...

I'm on Dianemo S, but I guess this part of the system is pretty similar to LMCE also...

Any advice would help me a lot. My system is useless when this happens and it happens quite often (or all the time)...

Thanks in advance,

regards,

Bulek.

Edit: I think that Radu found the origin of this query (or similar one when we talked about it) here :
http://svn.linuxmce.org/trac.cgi/browser/trunk/src/UpdateMedia/MediaState.cpp    line 59....


Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: Marie.O on September 20, 2013, 02:22:17 pm
Disable UpdateMedia and see if that helps your throughput, ie.

/usr/pluto/bin/UpdateMediaDaemonControl -disable

Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: tschak909 on September 20, 2013, 05:45:40 pm
Are any of these disks on USB? Or are you using PCI cards to attach the disks? How big is your library?

-Thom
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: bulek on September 21, 2013, 12:58:32 am
Hi,

both disks are sata. First is system disk, the second is for media (2T)... First one is about half full, second aroung 75%. I doubt I have the biggest media collection. When disk is fast enough (when that huge query is not slowing it down), then Orbiter doesn't crash. It probably gets all data within timeout... But I have a feeling, that all media is retrieved at that moment, although only subset will be shown on screen...

I've disabled UpdateMediaDaemon from command line (Radu just gave me a hint that check in web-admin is broken) and will see if that query will come back again....



Thanks in advance,

regards,

Bulek.
 
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: tschak909 on September 21, 2013, 02:00:51 am
I am currently working with someone to see if we can't optimize that query. This one is going to take a long while to fix... this query is a clusterfuck.

-Thom
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: tschak909 on September 21, 2013, 03:33:02 am
To give you an idea of how crazy of a rabbit hole, this is.. I've spent the last three and a half hours with a friend well versed in SQL. We were able to compact the query, but the results actually wound up being slower than when we started.

I could REALLY use a database designer's help here.

-Thom
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: bulek on September 21, 2013, 10:20:49 am
Hi,

I'm sorry I cannot help on this one. Thanks for your efforts...

Regards,

Bulek.
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: jamo on September 21, 2013, 01:02:36 pm
To give you an idea of how crazy of a rabbit hole, this is.. I've spent the last three and a half hours with a friend well versed in SQL. We were able to compact the query, but the results actually wound up being slower than when we started.

I could REALLY use a database designer's help here.

-Thom
Thom... did you have any luck there? I ran that query on my DB and I got
3017 rows in set (0.40 sec)
Obviously my media db is pretty modest but that's pretty snappy all the same... . Bulek, how big is yours and how long does it take if you run the query manually from the mysql prompt?
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: jamo on September 21, 2013, 02:22:00 pm
Hmmm... don't have time to check if the data is exactly the same but it is certainly the same number of rows and results in about a 25% speed improvement for me... perhaps someone else can try the following query rewrite to compare with the original:
Code: [Select]
SELECT PK_File, Path, Filename, INode,
  greatest(
    IF(Bookmark.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Bookmark.psc_mod),
    IF(Attribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Attribute.psc_mod),
    IF(LongAttribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), LongAttribute.psc_mod),
    IF(Picture_File.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Picture_File.psc_mod),
    CAST('0000-00-00 00:00:00' AS DATE)
  ) As CurrentDbAttrDate,
  (1000000 * COUNT(Bookmark.PK_Bookmark) + 10000 * COUNT(Attribute.PK_Attribute) +
   100 * COUNT(LongAttribute.PK_LongAttribute) + COUNT(Picture_File.FK_Picture))
  AS CurrentDbAttrCount,
  COUNT(Attribute.PK_Attribute)+COUNT(LongAttribute.PK_LongAttribute) AS HasAttributes,
  AttrDate AS OldDbAttrDate, AttrCount AS OldDbAttrCount, ModificationDate AS OldFileDate,
  Source
FROM
File
LEFT JOIN Bookmark ON Bookmark.FK_File = PK_File
LEFT JOIN File_Attribute ON File_Attribute.FK_File = PK_File
LEFT JOIN Attribute ON File_Attribute.FK_Attribute = PK_Attribute
LEFT JOIN LongAttribute ON LongAttribute.FK_File = PK_File
LEFT JOIN Picture_File ON Picture_File.FK_File = PK_File
  WHERE 1 = 1 AND (Path LIKE '/home/public/data%' OR (Path = '/home/public' AND Filename = 'data' )  OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' )  OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' )  OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' )  OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )  OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' )  OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' )  OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' )  OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' )  )  AND Missing = 0
GROUP BY PK_File, Path, Filename, INode, AttrDate, AttrCount, ModificationDate;
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: Marie.O on September 21, 2013, 02:26:37 pm
Could paste the output of the following commands:

select count(*) from Picture_File;

select count(*) from File;

select count(*) from File_Attribute;

select count(*) from LongAttribute;


EDIT: My results:

Code: [Select]
mysql> select count(*) from Picture_File;
+----------+
|     4285 |
+----------+
mysql> select count(*) from File;
+----------+
|    16715 |
+----------+
mysql> select count(*) from File_Attribute;
+----------+
|    35065 |
+----------+
mysql> select count(*) from LongAttribute;
+----------+
|     1043 |
+----------+
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: tschak909 on September 21, 2013, 07:00:06 pm
That is the very same query that my SQL friend and I came up with, last night. I suddenly went from 9 seconds, to 14.5 seconds. So for me, it's slower.

-Thom
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: tschak909 on September 21, 2013, 08:06:25 pm
I ran an explain select on jamo's query:

Code: [Select]
+----+-------------+----------------+--------+-----------------------------+---------+---------+-----------------------------------------+-------+----------------------------------------------+
| id | select_type | table          | type   | possible_keys               | key     | key_len | ref                                     | rows  | Extra                                        |
+----+-------------+----------------+--------+-----------------------------+---------+---------+-----------------------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | File           | ALL    | Path_Filename,Path,Filename | NULL    | NULL    | NULL                                    | 21015 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | Bookmark       | ALL    | NULL                        | NULL    | NULL    | NULL                                    |   387 |                                              |
|  1 | SIMPLE      | File_Attribute | ref    | PRIMARY,FK_File             | FK_File | 4       | pluto_media.File.PK_File                |     2 | Using index                                  |
|  1 | SIMPLE      | Attribute      | eq_ref | PRIMARY                     | PRIMARY | 4       | pluto_media.File_Attribute.FK_Attribute |     1 |                                              |
|  1 | SIMPLE      | LongAttribute  | ALL    | NULL                        | NULL    | NULL    | NULL                                    |   262 |                                              |
|  1 | SIMPLE      | Picture_File   | ref    | FK_File                     | FK_File | 4       | pluto_media.File.PK_File                |     1 |                                              |
+----+-------------+----------------+--------+-----------------------------+---------+---------+-----------------------------------------+-------+----------------------------------------------+

The filesort basically shows that for the bulk of our query, we are HAMMERING the filesystem...

-Thom
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: jamo on September 21, 2013, 09:19:41 pm
Code: [Select]
mysql> select count(*) from Picture_File;
+----------+
| count(*) |
+----------+
|      154 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from File;
+----------+
| count(*) |
+----------+
|     4333 |
+----------+
1 row in set (0.00 sec)

mysql> select count(*) from File_Attribute;
+----------+
| count(*) |
+----------+
|    11794 |
+----------+
1 row in set (0.01 sec)

mysql> select count(*) from LongAttribute;
+----------+
| count(*) |
+----------+
|       22 |
+----------+
1 row in set (0.00 sec)

Yes, you have a lot more media than I do (and more pictures!)
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: jamo on September 21, 2013, 09:22:11 pm
Thom

Did some research on that explain plan you posted and I think the problem was the group-by clause that groups by a bunch of dependent fields that aren't indexed. If we group on the PK of File and then join back to file to get the fields we need it seems to take care of that. In fact, it improves my query by quite a dramatic amount. Does anyone else want to try this?

Code: [Select]
SELECT File.PK_File, File.Path, File.Filename, File.Inode,
  s.CurrentDbAttrDate, s.CurrentDbAttrCount, s.HasAttributes,
  File.AttrDate AS OldDbAttrDate, File.AttrCount AS OldDbAttrCount,
  File.ModificationDate AS OldFileDate, File.Source
FROM
(SELECT PK_File,
  greatest(
    IF(Bookmark.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Bookmark.psc_mod),
    IF(Attribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Attribute.psc_mod),
    IF(LongAttribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), LongAttribute.psc_mod),
    IF(Picture_File.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Picture_File.psc_mod),
    CAST('0000-00-00 00:00:00' AS DATE)
  ) As CurrentDbAttrDate,
  (1000000 * COUNT(Bookmark.PK_Bookmark) + 10000 * COUNT(Attribute.PK_Attribute) +
   100 * COUNT(LongAttribute.PK_LongAttribute) + COUNT(Picture_File.FK_Picture))
  AS CurrentDbAttrCount,
  COUNT(Attribute.PK_Attribute)+COUNT(LongAttribute.PK_LongAttribute) AS HasAttributes
FROM
File
LEFT JOIN Bookmark ON Bookmark.FK_File = PK_File
LEFT JOIN File_Attribute ON File_Attribute.FK_File = PK_File
LEFT JOIN Attribute ON File_Attribute.FK_Attribute = PK_Attribute
LEFT JOIN LongAttribute ON LongAttribute.FK_File = PK_File
LEFT JOIN Picture_File ON Picture_File.FK_File = PK_File
  WHERE 1 = 1 AND (Path LIKE '/home/public/data%' OR (Path = '/home/public' AND Filename = 'data' )  OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' )  OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' )  OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' )  OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )  OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' )  OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' )  OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' )  OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' )  )  AND Missing = 0
GROUP BY PK_File) s
inner join File on s.PK_File=File.PK_File;
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: tschak909 on September 21, 2013, 09:37:02 pm
Shaved off about 4 seconds, to 11.90 seconds:

Code: [Select]
+----+-------------+----------------+--------+-----------------------------+---------+---------+-----------------------------------------+-------+----------------------------------------------+
| id | select_type | table          | type   | possible_keys               | key     | key_len | ref                                     | rows  | Extra                                        |
+----+-------------+----------------+--------+-----------------------------+---------+---------+-----------------------------------------+-------+----------------------------------------------+
|  1 | PRIMARY     | <derived2>     | ALL    | NULL                        | NULL    | NULL    | NULL                                    | 20292 |                                              |
|  1 | PRIMARY     | File           | eq_ref | PRIMARY                     | PRIMARY | 4       | s.PK_File                               |     1 |                                              |
|  2 | DERIVED     | File           | ALL    | Path_Filename,Path,Filename | NULL    | NULL    | NULL                                    | 21015 | Using where; Using temporary; Using filesort |
|  2 | DERIVED     | Bookmark       | ALL    | NULL                        | NULL    | NULL    | NULL                                    |   386 |                                              |
|  2 | DERIVED     | File_Attribute | ref    | PRIMARY,FK_File             | FK_File | 4       | pluto_media.File.PK_File                |     2 | Using index                                  |
|  2 | DERIVED     | Attribute      | eq_ref | PRIMARY                     | PRIMARY | 4       | pluto_media.File_Attribute.FK_Attribute |     1 |                                              |
|  2 | DERIVED     | LongAttribute  | ALL    | NULL                        | NULL    | NULL    | NULL                                    |   262 |                                              |
|  2 | DERIVED     | Picture_File   | ref    | FK_File                     | FK_File | 4       | pluto_media.File.PK_File                |     1 |                                              |
+----+-------------+----------------+--------+-----------------------------+---------+---------+-----------------------------------------+-------+----------------------------------------------+
8 rows in set (12.18 sec)
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: jamo on September 21, 2013, 09:41:37 pm
explain on my side
Code: [Select]
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
| id | select_type | table          | type   | possible_keys      | key     | key_len | ref                                     | rows | Extra       |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
|  1 | PRIMARY     | <derived2>     | ALL    | NULL               | NULL    | NULL    | NULL                                    | 3017 |             |
|  1 | PRIMARY     | File           | eq_ref | PRIMARY,FatGroupBy | PRIMARY | 4       | s.PK_File                               |    1 |             |
|  2 | DERIVED     | File           | index  | Path,Filename      | PRIMARY | 4       | NULL                                    | 3231 | Using where |
|  2 | DERIVED     | Bookmark       | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | File_Attribute | ref    | PRIMARY,FK_File    | FK_File | 4       | pluto_media.File.PK_File                |    2 | Using index |
|  2 | DERIVED     | Attribute      | eq_ref | PRIMARY            | PRIMARY | 4       | pluto_media.File_Attribute.FK_Attribute |    1 |             |
|  2 | DERIVED     | LongAttribute  | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | Picture_File   | ref    | FK_File            | FK_File | 4       | pluto_media.File.PK_File                |    1 |             |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
8 rows in set (0.10 sec)
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: bushtech on September 21, 2013, 10:41:39 pm
Ran it on my core. New 1004 installation so fairly clean. 60 odd dvd's on media drive currently. Using a ssd as OS driveRan it on my core. New 1004 installation so fairly clean. 60 odd dvd's on media drive currently. Using a ssd as OS drive


Code: [Select]

| id | select_type | table          | type   | possible_keys      | key     | key_len | ref                                     | rows | Extra       |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
|  1 | PRIMARY     | <derived2>     | ALL    | NULL               | NULL    | NULL    | NULL                                    |  542 |             |
|  1 | PRIMARY     | File           | eq_ref | PRIMARY,FatGroupBy | PRIMARY | 4       | s.PK_File                               |    1 |             |
|  2 | DERIVED     | File           | index  | Path,Filename      | PRIMARY | 4       | NULL                                    |  162 | Using where |
|  2 | DERIVED     | Bookmark       | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | File_Attribute | ref    | PRIMARY,FK_File    | FK_File | 4       | pluto_media.File.PK_File                |    3 | Using index |
|  2 | DERIVED     | Attribute      | eq_ref | PRIMARY            | PRIMARY | 4       | pluto_media.File_Attribute.FK_Attribute |    1 |             |
|  2 | DERIVED     | LongAttribute  | ref    | FK_File            | FK_File | 5       | pluto_media.File.PK_File                |    1 |             |
|  2 | DERIVED     | Picture_File   | ref    | FK_File            | FK_File | 4       | pluto_media.File.PK_File                |    1 |             |
+----+-------------+----------------+--------+--------------------+---------+---------+-----------------------------------------+------+-------------+
8 rows in set (0.02 sec)



Jamo asked me to post this link to his explain code:  http://pastebin.com/jnQsJGAc











Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: bulek on September 21, 2013, 11:15:05 pm
Hi,

here are the results for my database:

Quote
mysql> select count(*) from Picture_File;
+----------+
| count(*) |
+----------+
|     3999 |
+----------+
1 row in set (0.04 sec)

mysql> select count(*) from File;
+----------+
| count(*) |
+----------+
|  2422647 |
+----------+
1 row in set (1.37 sec)

mysql> select count(*) from File_Attribute;
+----------+
| count(*) |
+----------+
|    41181 |
+----------+
1 row in set (0.34 sec)

mysql> select count(*) from LongAttribute;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.02 sec)


Regards,
Bulek.
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: tschak909 on September 22, 2013, 01:36:17 am
After letting updatemedia crawl my DB from a clean wipe of the pluto_media db, The resulting query time still is approximately 10 seconds.

I have put my pluto_media.sql here: https://96.226.245.101/work/pluto_media.sql ... if anyone wants to look.

-Thom
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: golgoj4 on September 22, 2013, 06:57:53 am

SELECT COUNT( * ) FROM FILE - 28847  Query took 0.0001 sec

SELECT COUNT( * )  FROM Picture_File - 6358 - is instant a time


select count(*) from File_Attribute; -  105293   - no seriously, is it?

1004 running samsung SSD




Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: tschak909 on September 22, 2013, 08:37:09 pm
I think the purpose of posde asking for these numbers is to find where the bottlenecks would be in the large query. Asking for a linear return of a table, without any joins, will _always_ be fast.

-Thom
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: buckle on September 25, 2013, 04:41:40 am
Making sure all the join conditions are indexed should help.

Code: [Select]
CREATE INDEX IX_Bookmark_FK_File
    ON Bookmark (FK_File);

CREATE INDEX IX_LongAttribute_FK_File
    ON LongAttribute (FK_File);

Using Thom's database on my machine I went from 20 seconds to < 3 seconds.


Typically I would check to make sure I had properly indexed anything in a where clause as well.

Adding one for Missing should also improve the query performance, although the difference for me was negligible.

Code: [Select]
CREATE INDEX IX_File_Missing
ON File(Missing);

FileName and Path indexes did not yield anything for me.  Here they are anyway...
Code: [Select]
CREATE INDEX ix_File_Path
ON File;

CREATE INDEX ix_File_FileName
ON File;

Hope this helps.

-buckle
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: Marie.O on September 25, 2013, 04:05:35 pm
Missing should have already been implemented.
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: phenigma on September 27, 2013, 01:10:56 am
Here's a version that I'm testing now.  It fixes the CurrentDbAttrCount and HasAttributes values that were not reporting correctly in jamo's post.  I am no one as far as mysql goes but I've been bashing on this for a couple hours.  

The missing index was not implemented on my db.

Thanks jamo, buckle, bulek, tschak, and others for working on this query!  This has been a great collaboration!

Code: [Select]
SELECT
  File.PK_File,
  File.Path,
  File.Filename,
  File.Inode,  
  s.CurrentDbAttrDate,
  s.CurrentDbAttrCount,
  s.HasAttributes,  
  File.AttrDate AS OldDbAttrDate,
  File.AttrCount AS OldDbAttrCount,    
  File.ModificationDate AS OldFileDate,
  File.Source

FROM
(
  SELECT
    PK_File,
    greatest(
      (IF(Bookmark.psc_mod IS NULL, CAST('0000-00-00 00:00:00' AS DATE),Bookmark.psc_mod)),
      (IF(Attribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Attribute.psc_mod)),
      (IF(LongAttribute.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), LongAttribute.psc_mod)),
      (IF(Picture_File.psc_mod IS NULL,CAST('0000-00-00 00:00:00' AS DATE), Picture_File.psc_mod)) ) As CurrentDbAttrDate,
    (select count(*) from Bookmark where FK_File=PK_File) as BOOKMARKCOUNT,
    (select count(*) from File_Attribute where FK_File=PK_File) as ACOUNT,    
    (select count(*) from LongAttribute where FK_File=PK_File) as LONGATTRCOUNT,
    (select count(*) from Picture_File where FK_File=PK_File) as IMGCOUNT,
    ((1000000 * (select BOOKMARKCOUNT)) +
      (10000 * (select ACOUNT)) +    
      (100 * (select LONGATTRCOUNT)) +
      (select IMGCOUNT)) as CurrentDbAttrCount,
    ((select ACOUNT) + (select LONGATTRCOUNT)) as HasAttributes

  FROM File
    LEFT JOIN Bookmark ON Bookmark.FK_File = PK_File
    LEFT JOIN File_Attribute ON File_Attribute.FK_File = PK_File
    LEFT JOIN Attribute ON File_Attribute.FK_Attribute = PK_Attribute
    LEFT JOIN LongAttribute ON LongAttribute.FK_File = PK_File
    LEFT JOIN Picture_File ON Picture_File.FK_File = PK_File

  WHERE 1 = 1
    AND ( Path LIKE '/home/public/data%' OR (Path ='/home/public' AND Filename = 'data' )  
      OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' )  
      OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' )  
      OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' )  
      OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )
      OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' )  
      OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' )  
      OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' )  
      OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' ) )  
    AND Missing = 0

    GROUP BY PK_File
) s

INNER JOIN File ON s.PK_File=File.PK_File;

J.
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: tschak909 on September 27, 2013, 05:56:34 am
Can we get at least one more successful test of this, before I plop this into UpdateMedia/MediaState.cpp ?

-Thom
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: buckle on September 27, 2013, 10:35:04 pm
I have a new query that also addressed the attribute counts issue with Jamo's version. 

I think it is about as simplified as I can make it.  I believe it was over-complicated from the beginning, at least for what the query set out to achieve.

Code: [Select]
SELECT
F.PK_File
,F.Path
,F.Filename
,F.Inode
,greatest(
IFNULL(B.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
IFNULL(A.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
IFNULL(LA.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
IFNULL(PF.max_psc_mod, CAST('0000-00-00 00:00:00' AS DATE)),
CAST('0000-00-00 00:00:00' AS DATE)
) As CurrentDbAttrDate
,(1000000 * IFNULL(B.Bookmark_Count, 0)
+ 10000 * IFNULL(A.Attribute_Count, 0)
+ 100 * IFNULL(LA.Long_Attribute_Count, 0)
+ IFNULL(PF.Picture_Count, 0)
) AS CurrentDbAttrCount
,IFNULL(A.Attribute_Count,0)+IFNULL(LA.Long_Attribute_Count,0) AS HasAttributes
,F.AttrDate AS OldDbAttrDate
,F.AttrCount AS OldDbAttrCount
,F.ModificationDate AS OldFileDate
,F.Source
FROM
File F
LEFT JOIN (select MAX(psc_mod) as max_psc_mod,  FK_File as FK_File, count(*) as Bookmark_Count from Bookmark group by FK_File) B ON B.FK_File = F.PK_File
LEFT JOIN (select MAX(Attrib.psc_mod) as max_psc_mod, count(*) as Attribute_Count, FileAttrib.FK_File as FK_File from File_Attribute FileAttrib left join Attribute Attrib ON FileAttrib.FK_Attribute = Attrib.PK_Attribute group by FK_File) A on A.FK_File = F.PK_File
LEFT JOIN (select MAX(psc_mod) as max_psc_mod, FK_File as FK_File, count(*) as Long_Attribute_Count from LongAttribute group by FK_File) LA ON LA.FK_File = F.PK_File
LEFT JOIN (select MAX(psc_mod) as max_psc_mod, FK_File as FK_File, count(*) as Picture_Count from Picture_File group by FK_File) PF ON PF.FK_File = F.PK_File
WHERE 1 = 1
AND ( Path LIKE '/home/public/data%' OR (Path ='/home/public' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_1/data%' OR (Path = '/home/user_1' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_2/data%' OR (Path = '/home/user_2' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_3/data%' OR (Path = '/home/user_3' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_4/data%' OR (Path = '/home/user_4' AND Filename = 'data' )
      OR Path LIKE '/home/user_5/data%' OR (Path = '/home/user_5' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_6/data%' OR (Path = '/home/user_6' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_7/data%' OR (Path = '/home/user_7' AND Filename = 'data' ) 
      OR Path LIKE '/home/user_8/data%' OR (Path = '/home/user_8' AND Filename = 'data' ) ) 
AND Missing = 0

I used phenigma's where clause for the sake of being thorough, and because it looked nice. :)

It compares exactly 100% against the original query in my test environment (from tschak's script). 

It performs pretty well, too.  Let me know how it works for you.

-buckle
Title: Re:
Post by: phenigma on September 28, 2013, 02:09:56 pm
Well...  As we found last night.  Buckle's query takes over 20 minutes to run on my 1004.  Jamo's with the additional corrections I made last night runs in 6 seconds.  The original takes over 25 seconds.  

http://pastebin.com/9XxjscCf <- jamo's with corrections

Sent from my SGH-I927 using Tapatalk 4
Title: Re:
Post by: phenigma on September 30, 2013, 05:38:41 am
I'm hoping we can get some more people to test the various versions and report back about the timings.

Sent from my SGH-I927 using Tapatalk 4
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: bushtech on September 30, 2013, 07:21:37 am
phenigma:

Ran jamo's query from your last post:

But I get this error:

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that       corresponds to your MySQL server version for the right syntax to use near 'ON PK      _Attribute=FK_Attribute WHERE FK_File=PK_File),
(select (IFNULL(MAX(LongAtt' at line 22
Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: jamo on September 30, 2013, 08:19:59 am
Bushtek...

there should be no space betweekn "PK" and "_Attribute". I think copying and pasting has introduced an error. It should read "ON PK_Attribute=FK_Attribute" or something like that.

phenigma:

Ran jamo's query from your last post:

But I get this error:

ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that       corresponds to your MySQL server version for the right syntax to use near 'ON PK      _Attribute=FK_Attribute WHERE FK_File=PK_File),
(select (IFNULL(MAX(LongAtt' at line 22

Title: Re: Huge query in pluto_media : taking my disk down to crawling
Post by: bushtech on September 30, 2013, 08:43:42 am
Thanks jamo

Error occurs when I copy & paste the Raw paste data. Works when I use the data in the top field. Lesson learnt ???
Title: Re:
Post by: phenigma on September 30, 2013, 02:14:37 pm
Likely a copy paste issue.  Edit:  Hmm, getting used to this Tapatalk app, and missing some posts apparently.  Ho hum.  Never mind me.

Sent from my SGH-I927 using Tapatalk 4