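For how to enable the search feature in Open edX, see my earlier review article: Open edX最新版评测与新特性探索(20151214)

The official documentation also introduces this feature: Searching the Course

There is technical documentation as well: Enabling Open edX Search. As you might expect, it changes often and is incomplete, as usual. Then again, if you are as optimistic as I am, you will probably see that as a good thing: it means the project is being pushed forward continuously and quickly.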

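#Ready

We have an interesting idea involving edx search (the focus here is on course/content search), and to implement it we need to understand the mechanism in depth. I later found that other people online also need to customize it deeply, so I am sharing my analysis here as a starting point. It may spark an interesting discussion, help those who come later, or give me some new ideas.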

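#Go

###Collecting resources

First, we gather the resources we could find.

###Reading the technical documentation

From the documentation we get a rough picture of the design: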

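  • edx-search is essentially a Django application used to communicate with edx-platform
  • Searching works by first building an index and then looking up matches in that index; if you are familiar with Elasticsearch, this will feel natural
  • Python package dependencies: Django, pyMongo, pytz, elasticsearch
  • The indexed data is determined by the index_dictionary() method. The following are currently indexed:
    • Sequence
    • Vertical
    • Video
    • HTML Block
    • Course metadata as well: course name, course description, and course start and end dates
  • LMS and CMS offer a number of switches and customizable settings. The ones I consider important:
    • Choosing the engine: SEARCH_ENGINE, which currently supports ElasticSearchEngine and MockSearchEngine
    • Adding searchable information: ELASTIC_FIELD_MAPPINGS, a dictionary that you can extend yourself
    • For deep customization of search: SEARCH_INITIALIZER, SEARCH_RESULT_PROCESSOR, and SEARCH_FILTER_GENERATOR are the useful interfaces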
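To make the settings above more concrete, here is a minimal sketch of what such a settings fragment might look like. The setting names come from the documentation, and the three class paths are the ones listed later in this post. The `SEARCH_ENGINE` value and the `start_date` mapping entry are assumptions for illustration, so check them against your own edx-platform release.

```python
# Hypothetical settings fragment for an LMS settings module.
# Setting names come from the documentation; values marked "assumed"
# are illustrative and may differ between releases.

# Which engine performs the search (assumed dotted path).
SEARCH_ENGINE = "search.elastic.ElasticSearchEngine"

# Extra field mappings: a plain dictionary that can be extended.
ELASTIC_FIELD_MAPPINGS = {
    "start_date": {"type": "date"},  # assumed, illustrative entry
}

# Hooks for deeper customization, given as dotted class paths.
SEARCH_INITIALIZER = (
    "lms.lib.courseware_search.lms_search_initializer.LmsSearchInitializer"
)
SEARCH_RESULT_PROCESSOR = (
    "lms.lib.courseware_search.lms_result_processor.LmsSearchResultProcessor"
)
SEARCH_FILTER_GENERATOR = (
    "lms.lib.courseware_search.lms_filter_generator.LmsSearchFilterGenerator"
)
```

Replacing one of the three dotted paths with a path to your own class is, as far as I can tell, the intended way to plug in custom behavior.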

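Looking further into the edX repositories, we learn that:

  • The search business logic lives in edX code, which uses elasticsearch-py to talk to Elasticsearch
  • edX search is built on Elasticsearch; checking edx/configuration, we find that Birch, Cypress, and Dogwood all use version 0.90.11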

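###Customization approach

From the preliminary analysis above we know that we can provide our own search logic by implementing interfaces. Note that we have not touched the source code yet and already have a rough understanding of how course search is designed, which shows how valuable reading the documentation is. Three interfaces are key: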

  • SEARCH_INITIALIZER, for the source see LmsSearchInitializer
    • lms.lib.courseware_search.lms_search_initializer.LmsSearchInitializer
  • SEARCH_RESULT_PROCESSOR, for the source see LmsSearchResultProcessor
    • lms.lib.courseware_search.lms_result_processor.LmsSearchResultProcessor
  • SEARCH_FILTER_GENERATOR, for the source see LmsSearchFilterGenerator
    • lms.lib.courseware_search.lms_filter_generator.LmsSearchFilterGenerator
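Each of these settings holds a dotted path to a class rather than the class itself. A minimal sketch of how such a path can be resolved at runtime follows, assuming the usual Django-style convention of "module path plus class name". The helper name `load_class` is hypothetical and this is not necessarily how edx-search does it internally.

```python
import importlib


def load_class(dotted_path):
    """Resolve 'package.module.ClassName' to the class object it names."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError("expected a dotted path, got %r" % dotted_path)
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


# Demonstrated with a standard-library class, because edx-platform is not
# importable outside an Open edX installation.
ordered_dict_cls = load_class("collections.OrderedDict")
print(ordered_dict_cls.__name__)
```

Inside an Open edX environment the same call with one of the paths above should return the corresponding class, which is a convenient starting point for subclassing.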

###Seeing is believing

First we start devstack (in fact, this is running in a production environment):

    sudo -u www-data /edx/bin/python.edxapp /edx/app/edxapp/edx-platform/manage.py lms runserver 0.0.0.0:5000 --settings devstack

Course search has three parts:

  • course discovery
  • all course search
  • single course search

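First comes course discovery, which we use to find the courses we want (screenshot: course_discovery0).

Let's look at the actual request in the browser's developer tools (screenshot: course_discovery1).

The search is actually a RESTful-style API integrated into the page via AJAX. We will trace this endpoint later.

Next comes the all course search part (screenshot: all_course_search0.png).

Again, open the developer tools (screenshot: all_course_search1.png).

We can see the endpoint URL. The search request is sent as a POST.

Finally, let's look at search within a course (single course search) (screenshot: single_course_search0.png).

The developer tools show that this is a POST request to http://ip:5000/search/course-v1:edX+DemoX+Demo_Course

The response is: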

{"access_denied_count": 0, "total": 2, "max_score": 0.6223112, "took": 28, "results": [{"_type": "courseware_content", "score": 0.6223112, "_index": "courseware_index", "_score": 0.6223112, "_id": "block-v1:edX+DemoX+Demo_Course+type@html+block@2bee8c4248e842a19ba1e73ed8d426c2", "data": {"course_name": "edX Demonstration Course", "url": "/courses/course-v1:edX+DemoX+Demo_Course/jump_to/block-v1:edX+DemoX+Demo_Course+type@html+block@2bee8c4248e842a19ba1e73ed8d426c2", "excerpt": "<b>Lab</b>s and Demos Professors that create courses on edX are able to", "start_date": "2013-02-05T00:00:00+00:00", "content": {"html_content": "Labs and Demos Professors that create courses on edX are able to implement highly interactive experiences that allow you as a student to experiment using easy to use online web applications. These labs are customized to each class and subject area. We have collected a couple of the more popular lab environments here for you to experience and play with. Please be patient with yourself as you take a look around at these lab environments. You probably will not be able to answer these questions without taking a course on the topic first! 
", "display_name": "Labs and Demos"}, "course": "course-v1:edX+DemoX+Demo_Course", "location": ["Example Week 2: Get Interactive", "Homework - Labs and Demos", "Labs and Demos"], "content_type": "Text", "org": "edX", "content_groups": null, "id": "block-v1:edX+DemoX+Demo_Course+type@html+block@2bee8c4248e842a19ba1e73ed8d426c2"}}, {"_type": "courseware_content", "score": 0.16797835, "_index": "courseware_index", "_score": 0.16797835, "_id": "block-v1:edX+DemoX+Demo_Course+type@html+block@Lab_5B_Mosfet_Amplifier_Experiment", "data": {"course_name": "edX Demonstration Course", "url": "/courses/course-v1:edX+DemoX+Demo_Course/jump_to/block-v1:edX+DemoX+Demo_Course+type@html+block@Lab_5B_Mosfet_Amplifier_Experiment", "excerpt": "There are no responses that need to be checked.In the <b>lab</b> below, you", "start_date": "1970-01-01T05:00:00+00:00", "content": {"html_content": "MOSFET AMPLIFIER EXPERIMENTThis demonstration is to develop your intuition about amplifiers and biasing, and to have fun with music! There are no responses that need to be checked.In the lab below, you will find:A circuit schematic of the MOSFET amplifier. You can use the sliders to the left of the circuit to control various parameters of the MOSFET and the amplifier.A plot (as a function of time) of selected voltages from the amplifier circuit. You can select the input waveform (e.g., sine wave, square wave, various types of music) by using the \\(v_\\mathrm{IN}\\) drop-down menu and the associated sliders. (The parameter \\(V_\\mathrm{MAX}\\) sets the maximum range on the plots.)The \"Play\" button which lets you listen to the selected voltage waveform as sound. 
Try it out!Listen to:vINvOUTvRGraph:vINvOUTvRvIN:Zero InputUnit ImpulseUnit StepSine WaveSquare WaveClassical MusicFolk MusicJazz MusicReggae MusicYour browser must support the Canvas element and have JavaScript enabled to view this tool.Your browser must support the Canvas element and have JavaScript enabled to view this tool.Experiment 1: Distorted outputBegin by selecting a sine wave input in the drop-down menu for \\(v_\\mathrm{IN}\\). Then, adjust the sliders to the baseline (default) setting shown below.Baseline setting of sliders:Peak to peak voltage: \\(v_\\mathrm{IN}=3~\\mathrm{V}\\),Frequency: \\(f = 1000~\\mathrm{Hz}\\),Supply voltage: \\(V_\\mathrm{S}=1.6~\\mathrm{V}\\),Input bias voltage: \\(V_\\mathrm{BIAS}=2.5~\\mathrm{V}\\),Pull-up resistor: \\(R = 10~\\mathrm{k}\\Omega\\),MOSFET parameter: \\(K=\\frac{1~\\mathrm{mA}}{\\mathrm{V}^2}\\),MOSFET threshold voltage: \\(V_\\mathrm{T} = 1~\\mathrm{V}\\),Vertical plot range: \\(V_\\mathrm{MAX} = 2~\\mathrm{V}\\).You should observe in the plot that with the baseline settings, the amplifier produces a distorted sine wave signal for \\(v_{OUT}\\). Next, go ahead and select one of the music signals as the input and listen to each of \\(v_{IN}\\) and \\(v_{OUT}\\), and confirm for yourself that the output sounds degraded at the chosen slider settings. You will notice that the graph now plots the music signal waveforms. Think about the reasons why the amplifier is producing a distorted output.Experiment 2: Linear regimeWe now study the amplifier's small signal behavior. Select a sine wave as the input signal. To study the small signal behavior, reduce the value of \\(v_{IN}\\) to 0.1V (peak-to-peak) by using the \\(v_{IN}\\) slider. Keeping the rest of the parameters at their baseline settings, derive an appropriate value of \\(V_{BIAS}\\) that will ensure saturation region operation for the MOSFET for the 0.1V peak-to-peak swing for \\(v_{IN}\\). 
Make sure to think about both positive and negative excursions of the signals.Next, use the \\(V_{BIAS}\\) slider to choose your computed value for \\(V_{BIAS}\\) and see if the observed plot of \\(v_{OUT}\\) is more-or-less distortion free. If your calculation was right, then the output will indeed be distortion free.Next, select one of the music signals as the input and listen to each of \\(v_{IN}\\) and \\(v_{OUT}\\), and confirm for yourself that the output sounds much better than in Experiment 1. Also, based on sound volume, convince yourself that \\(v_{OUT}\\) is an amplified version of \\(v_{IN}\\).Experiment 3: Your settingsNow go ahead and experiment with various other settings while listening to the music signal at \\(v_{OUT}\\). Observe the plots and listen to \\(v_{OUT}\\) as you change, for example, the bias voltage \\(V_{BIAS}\\). You will notice that the amplifier distorts the input signal when \\(V_{BIAS}\\) becomes too small, or when it becomes too large. You can also experiment with various values of \\(v_{IN}\\), \\(R_{L}\\), etc., and see how they affect the amplification and distortion.", "display_name": "Electronic Sound Experiment"}, "course": "course-v1:edX+DemoX+Demo_Course", "location": ["Example Week 2: Get Interactive", "Lesson 2 - Let's Get Interactive!", "Electronic Sound Experiment"], "content_type": "Text", "org": "edX", "content_groups": null, "id": "block-v1:edX+DemoX+Demo_Course+type@html+block@Lab_5B_Mosfet_Amplifier_Experiment"}}]}

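###An experiment

Let's experiment with httpie and send the request from outside the page. With AJAX it is inconvenient to inspect the data directly, and everything feels coupled together and untidy.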

http -f POST http://209.9.106.99:5000/search/course-v1:edX+DemoX+Demo_Course search_string=edx page_size=20 page_index=0 'Cookie:sessionid=pzjqyf6kdoo8jj96ng753xhr1isvstm3;csrftoken=sjPry3O5UpFFp3N3izrIVvd9ZMDEWA7V' X-CSRFToken:sjPry3O5UpFFp3N3izrIVvd9ZMDEWA7V

The URL format is DOMAIN/search/

The response:
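For scripting, the same request can be assembled in Python. The sketch below uses only the standard library and builds the request without sending it. The function name `build_search_request`, the host, and the placeholder session and CSRF values are all hypothetical; substitute the values from your own logged-in browser session.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_search_request(base_url, course_id, search_string,
                         session_id, csrf_token,
                         page_size=20, page_index=0):
    """Build (but do not send) the form-encoded POST used by course search."""
    form = urlencode({
        "search_string": search_string,
        "page_size": page_size,
        "page_index": page_index,
    }).encode("ascii")
    headers = {
        # Same cookie and CSRF header as in the httpie command.
        "Cookie": "sessionid=%s;csrftoken=%s" % (session_id, csrf_token),
        "X-CSRFToken": csrf_token,
        "Content-Type": "application/x-www-form-urlencoded",
    }
    url = "%s/search/%s" % (base_url.rstrip("/"), course_id)
    return Request(url, data=form, headers=headers, method="POST")


request = build_search_request(
    "http://localhost:5000",             # placeholder host
    "course-v1:edX+DemoX+Demo_Course",
    "edx",
    session_id="YOUR_SESSION_ID",        # placeholder
    csrf_token="YOUR_CSRF_TOKEN",        # placeholder
)
print(request.get_method(), request.full_url)
print(request.data.decode("ascii"))
```

To actually send it, pass `request` to `urllib.request.urlopen` and decode the body with `json.loads`.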

{
    "access_denied_count": 0,
    "max_score": 0.6223112,
    "results": [
        {
            "_id": "block-v1:edX+DemoX+Demo_Course+type@html+block@2bee8c4248e842a19ba1e73ed8d426c2",
            "_index": "courseware_index",
            "_score": 0.6223112,
            "_type": "courseware_content",
            "data": {
                "content": {
                    "display_name": "Labs and Demos",
                    "html_content": "Labs and Demos Professors that create courses on edX are able to implement highly interactive experiences that allow you as a student to experiment using easy to use online web applications. These labs are customized to each class and subject area. We have collected a couple of the more popular lab environments here for you to experience and play with. Please be patient with yourself as you take a look around at these lab environments. You probably will not be able to answer these questions without taking a course on the topic first! "
                },
                "content_groups": null,
                "content_type": "Text",
                "course": "course-v1:edX+DemoX+Demo_Course",
                "course_name": "edX Demonstration Course",
                "excerpt": "<b>Lab</b>s and Demos Professors that create courses on edX are able to",
                "id": "block-v1:edX+DemoX+Demo_Course+type@html+block@2bee8c4248e842a19ba1e73ed8d426c2",
                "location": [
                    "Example Week 2: Get Interactive",
                    "Homework - Labs and Demos",
                    "Labs and Demos"
                ],
                "org": "edX",
                "start_date": "2013-02-05T00:00:00+00:00",
                "url": "/courses/course-v1:edX+DemoX+Demo_Course/jump_to/block-v1:edX+DemoX+Demo_Course+type@html+block@2bee8c4248e842a19ba1e73ed8d426c2"
            },
            "score": 0.6223112
        },
        {
            "_id": "block-v1:edX+DemoX+Demo_Course+type@html+block@Lab_5B_Mosfet_Amplifier_Experiment",
            "_index": "courseware_index",
            "_score": 0.16797835,
            "_type": "courseware_content",
            "data": {
                "content": {
                    "display_name": "Electronic Sound Experiment",
                    "html_content": "MOSFET AMPLIFIER EXPERIMENTThis demonstration is to develop your intuition about amplifiers and biasing, and to have fun with music! There are no responses that need to be checked.In the lab below, you will find:A circuit schematic of the MOSFET amplifier. You can use the sliders to the left of the circuit to control various parameters of the MOSFET and the amplifier.A plot (as a function of time) of selected voltages from the amplifier circuit. You can select the input waveform (e.g., sine wave, square wave, various types of music) by using the \\(v_\\mathrm{IN}\\) drop-down menu and the associated sliders. (The parameter \\(V_\\mathrm{MAX}\\) sets the maximum range on the plots.)The \"Play\" button which lets you listen to the selected voltage waveform as sound. Try it out!Listen to:vINvOUTvRGraph:vINvOUTvRvIN:Zero InputUnit ImpulseUnit StepSine WaveSquare WaveClassical MusicFolk MusicJazz MusicReggae MusicYour browser must support the Canvas element and have JavaScript enabled to view this tool.Your browser must support the Canvas element and have JavaScript enabled to view this tool.Experiment 1: Distorted outputBegin by selecting a sine wave input in the drop-down menu for \\(v_\\mathrm{IN}\\). Then, adjust the sliders to the baseline (default) setting shown below.Baseline setting of sliders:Peak to peak voltage: \\(v_\\mathrm{IN}=3~\\mathrm{V}\\),Frequency: \\(f = 1000~\\mathrm{Hz}\\),Supply voltage: \\(V_\\mathrm{S}=1.6~\\mathrm{V}\\),Input bias voltage: \\(V_\\mathrm{BIAS}=2.5~\\mathrm{V}\\),Pull-up resistor: \\(R = 10~\\mathrm{k}\\Omega\\),MOSFET parameter: \\(K=\\frac{1~\\mathrm{mA}}{\\mathrm{V}^2}\\),MOSFET threshold voltage: \\(V_\\mathrm{T} = 1~\\mathrm{V}\\),Vertical plot range: \\(V_\\mathrm{MAX} = 2~\\mathrm{V}\\).You should observe in the plot that with the baseline settings, the amplifier produces a distorted sine wave signal for \\(v_{OUT}\\). 
Next, go ahead and select one of the music signals as the input and listen to each of \\(v_{IN}\\) and \\(v_{OUT}\\), and confirm for yourself that the output sounds degraded at the chosen slider settings. You will notice that the graph now plots the music signal waveforms. Think about the reasons why the amplifier is producing a distorted output.Experiment 2: Linear regimeWe now study the amplifier's small signal behavior. Select a sine wave as the input signal. To study the small signal behavior, reduce the value of \\(v_{IN}\\) to 0.1V (peak-to-peak) by using the \\(v_{IN}\\) slider. Keeping the rest of the parameters at their baseline settings, derive an appropriate value of \\(V_{BIAS}\\) that will ensure saturation region operation for the MOSFET for the 0.1V peak-to-peak swing for \\(v_{IN}\\). Make sure to think about both positive and negative excursions of the signals.Next, use the \\(V_{BIAS}\\) slider to choose your computed value for \\(V_{BIAS}\\) and see if the observed plot of \\(v_{OUT}\\) is more-or-less distortion free. If your calculation was right, then the output will indeed be distortion free.Next, select one of the music signals as the input and listen to each of \\(v_{IN}\\) and \\(v_{OUT}\\), and confirm for yourself that the output sounds much better than in Experiment 1. Also, based on sound volume, convince yourself that \\(v_{OUT}\\) is an amplified version of \\(v_{IN}\\).Experiment 3: Your settingsNow go ahead and experiment with various other settings while listening to the music signal at \\(v_{OUT}\\). Observe the plots and listen to \\(v_{OUT}\\) as you change, for example, the bias voltage \\(V_{BIAS}\\). You will notice that the amplifier distorts the input signal when \\(V_{BIAS}\\) becomes too small, or when it becomes too large. You can also experiment with various values of \\(v_{IN}\\), \\(R_{L}\\), etc., and see how they affect the amplification and distortion."
                },
                "content_groups": null,
                "content_type": "Text",
                "course": "course-v1:edX+DemoX+Demo_Course",
                "course_name": "edX Demonstration Course",
                "excerpt": "There are no responses that need to be checked.In the <b>lab</b> below, you",
                "id": "block-v1:edX+DemoX+Demo_Course+type@html+block@Lab_5B_Mosfet_Amplifier_Experiment",
                "location": [
                    "Example Week 2: Get Interactive",
                    "Lesson 2 - Let's Get Interactive!",
                    "Electronic Sound Experiment"
                ],
                "org": "edX",
                "start_date": "1970-01-01T05:00:00+00:00",
                "url": "/courses/course-v1:edX+DemoX+Demo_Course/jump_to/block-v1:edX+DemoX+Demo_Course+type@html+block@Lab_5B_Mosfet_Amplifier_Experiment"
            },
            "score": 0.16797835
        }
    ],
    "took": 15,
    "total": 2
}

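The search on the course home page and the one on the Dashboard work in a similar way.

This technique is very helpful for debugging: because the output lands on the command line, we can freely use tools such as grep and jq to filter it. As long as data can flow to the command line, our Linux toolbox comes into play, and the power of combining tools surprises me every time. Pipes are probably among the most beautiful features of the Unix toolbox.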
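If you would rather filter in Python than with jq, something along these lines should work. It is a small sketch that runs on a trimmed copy of the response shown above, keeping only the fields it uses. The helper name `summarize` is my own.

```python
import json

# Trimmed copy of the response above: only the fields used here are kept.
RESPONSE_TEXT = """
{
  "total": 2,
  "results": [
    {"score": 0.6223112,
     "data": {"content": {"display_name": "Labs and Demos"},
              "location": ["Example Week 2: Get Interactive",
                           "Homework - Labs and Demos",
                           "Labs and Demos"]}},
    {"score": 0.16797835,
     "data": {"content": {"display_name": "Electronic Sound Experiment"},
              "location": ["Example Week 2: Get Interactive",
                           "Lesson 2 - Let's Get Interactive!",
                           "Electronic Sound Experiment"]}}
  ]
}
"""


def summarize(response):
    """Return (score, display name, location path) for every hit."""
    rows = []
    for hit in response["results"]:
        data = hit["data"]
        rows.append((
            hit["score"],
            data["content"]["display_name"],
            " > ".join(data["location"]),
        ))
    return rows


summary = summarize(json.loads(RESPONSE_TEXT))
for score, name, path in summary:
    print("%.3f  %s  (%s)" % (score, name, path))
```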

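Testing shows that the query must be a whole word. In the demo course, for example, searching for edx returns results while searching for e returns nothing. This is probably related to tokenization. The same issue will likely show up when localizing edX into Chinese.

Searching inside a course can retrieve Chinese text.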
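The whole-word behavior is what you would expect from an index keyed by tokens. The toy inverted index below illustrates the idea. It is a deliberately simplified model (lowercasing plus splitting on word characters), not Elasticsearch's actual analyzer, and all names in it are my own.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


documents = {
    "labs": "Labs and Demos Professors that create courses on edX",
    "mosfet": "MOSFET amplifier experiment",
}
index = build_index(documents)
print(search(index, "edx"))  # {'labs'}
print(search(index, "e"))    # set()
```

Because lookups are by exact token, "e" matches nothing even though many words contain that letter. Prefix or substring matching would need a different analysis step at index time.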

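###Analyzing the source code

####Class inheritance

waiting… The detailed source analysis is left for when I have time. For today, the search service has more or less been cleanly separated out, and many interesting things can be built on top of it.

###Elasticsearch

Elasticsearch is a distributed, scalable, real-time search and analytics engine. It helps you search, analyze, and explore data. The search features in edX are all built on it.

####Resources

* Elasticsearch: The Definitive Guide (Chinese edition)
* Elasticsearch: The Definitive Guide
* Indexing Elasticsearch data with Python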